commit ef0d3d064edcce08a3d3609ecf7129e17ee130e7
Author: Ben Hutchings
Date: Tue Nov 17 15:54:47 2015 +0000

Linux 3.2.73

commit a6826ecbeab9c832ed742653de895ad4de61c858
Author: David Howells
Date: Thu Oct 15 17:21:37 2015 +0100

KEYS: Fix crash when attempt to garbage collect an uninstantiated keyring

commit f05819df10d7b09f6d1eb6f8534a8f68e5a4fe61 upstream.

The following sequence of commands:

    i=`keyctl add user a a @s`
    keyctl request2 keyring foo bar @t
    keyctl unlink $i @s

tries to invoke an upcall to instantiate a keyring if one doesn't already exist by that name within the user's keyring set.

However, if the upcall fails, the code sets keyring->type_data.reject_error to -ENOKEY or some other error code. When the key is garbage collected, the key destroy function is called unconditionally and keyring_destroy() uses list_empty() on keyring->type_data.link - which is in a union with reject_error.

Subsequently, the kernel tries to unlink the keyring from the keyring names list - which oopses like this:

    BUG: unable to handle kernel paging request at 00000000ffffff8a
    IP: [] keyring_destroy+0x3d/0x88
    ...
    Workqueue: events key_garbage_collector
    ...
    RIP: 0010:[] keyring_destroy+0x3d/0x88
    RSP: 0018:ffff88003e2f3d30 EFLAGS: 00010203
    RAX: 00000000ffffff82 RBX: ffff88003bf1a900 RCX: 0000000000000000
    RDX: 0000000000000000 RSI: 000000003bfc6901 RDI: ffffffff81a73a40
    RBP: ffff88003e2f3d38 R08: 0000000000000152 R09: 0000000000000000
    R10: ffff88003e2f3c18 R11: 000000000000865b R12: ffff88003bf1a900
    R13: 0000000000000000 R14: ffff88003bf1a908 R15: ffff88003e2f4000
    ...
    CR2: 00000000ffffff8a CR3: 000000003e3ec000 CR4: 00000000000006f0
    ...
    Call Trace:
     [] key_gc_unused_keys.constprop.1+0x5d/0x10f
     [] key_garbage_collector+0x1fa/0x351
     [] process_one_work+0x28e/0x547
     [] worker_thread+0x26e/0x361
     [] ? rescuer_thread+0x2a8/0x2a8
     [] kthread+0xf3/0xfb
     [] ? kthread_create_on_node+0x1c2/0x1c2
     [] ret_from_fork+0x3f/0x70
     [] ? kthread_create_on_node+0x1c2/0x1c2

Note the value in RAX. This is a 32-bit representation of -ENOKEY.

The solution is to only call ->destroy() if the key was successfully instantiated.

Reported-by: Dmitry Vyukov
Signed-off-by: David Howells
Tested-by: Dmitry Vyukov
[carnil: Backported for 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 650f6aa8c3c805bd41c4243aadbd63558f39fd32
Author: David Howells
Date: Fri Sep 25 16:30:08 2015 +0100

KEYS: Fix race between key destruction and finding a keyring by name

commit 94c4554ba07adbdde396748ee7ae01e86cf2d8d7 upstream.

There appears to be a race between:

(1) key_gc_unused_keys() which frees key->security and then calls keyring_destroy() to unlink the name from the name list

(2) find_keyring_by_name() which calls key_permission(), thus accessing key->security, on a key before checking to see whether the key usage is 0 (ie. the key is dead and might be cleaned up).

Fix this by calling ->destroy() before cleaning up the core key data - including key->security.

Reported-by: Petr Matousek
Signed-off-by: David Howells
[carnil: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 3553e5d34d72a3aac5d967ec8b4d45a88340d679
Author: Eric Northup
Date: Tue Nov 3 18:03:53 2015 +0100

KVM: x86: work around infinite loop in microcode when #AC is delivered

commit 54a20552e1eae07aa240fa370a0293e006b5faed upstream.

It was found that a guest can DoS a host by triggering an infinite stream of "alignment check" (#AC) exceptions. This causes the microcode to enter an infinite loop where the core never receives another interrupt.
The host kernel panics pretty quickly due to the effects (CVE-2015-5307).

Signed-off-by: Eric Northup
Signed-off-by: Paolo Bonzini
[bwh: Backported to 3.2:
 - Add definition of AC_VECTOR
 - Adjust filename, context]
Signed-off-by: Ben Hutchings

commit e94e60d82b9fd6c592bfe7b939a991bfe98179ae
Author: Olga Kornievskaia
Date: Mon Sep 14 19:54:36 2015 -0400

Failing to send a CLOSE if file is opened WRONLY and server reboots on a 4.x mount

commit a41cbe86df3afbc82311a1640e20858c0cd7e065 upstream.

A test case is as the description says:

    open(foobar, O_WRONLY);
    sleep()  --> reboot the server
    close(foobar)

The bug is because in nfs4state.c, in nfs4_reclaim_open_state(), a few lines before going to restart, there is clear_bit(NFS4CLNT_RECLAIM_NOGRACE, &state->flags). NFS4CLNT_RECLAIM_NOGRACE is a flag for the client states, not open owner states. The value of NFS4CLNT_RECLAIM_NOGRACE is 4, which is the value of NFS_O_WRONLY_STATE in nfs4_state->flags. So clearing it wipes out state, and when we go to close it, "call_close" doesn't get set as the state flag is not set and CLOSE doesn't go on the wire.

Signed-off-by: Olga Kornievskaia
Signed-off-by: Trond Myklebust
Signed-off-by: Ben Hutchings

commit 11eea7a9dd9213b23a001d43000a2c60ccf707f4
Author: Charles Keepax
Date: Thu Nov 6 15:49:41 2014 +0000

asix: Do full reset during ax88772_bind

[ Upstream commit 436c2a5036b6ffe813310df2cf327d3b69be0734 ]

commit 3cc81d85ee01 ("asix: Don't reset PHY on if_up for ASIX 88772") causes the ethernet on Arndale to no longer function. This appears to be because the Arndale ethernet requires a full reset before it will function correctly; however, simply reverting the above patch causes problems with ethtool settings getting reset.

It seems the problem is that the ethernet is not properly reset during bind, and indeed the code in ax88772_bind that resets the device is a very small subset of the actual ax88772_reset function. This patch uses ax88772_reset in place of the existing reset code in ax88772_bind, which removes some code duplication and fixes the ethernet on Arndale.

It is still possible that the original patch causes some issues with suspend and resume but that seems like a separate issue and I haven't had a chance to test that yet.

Signed-off-by: Charles Keepax
Tested-by: Riku Voipio
Signed-off-by: David S. Miller
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings

commit 41700e5b773b0090bec9321445dcb9e6e12d6629
Author: Michel Stam
Date: Thu Oct 2 10:22:02 2014 +0200

asix: Don't reset PHY on if_up for ASIX 88772

[ Upstream commit 3cc81d85ee01e5a0b7ea2f4190e2ed1165f53c31 ]

I've noticed that every time the interface is set to 'up', the kernel reports that the link speed is set to 100 Mbps/Full Duplex, even when ethtool is used to set autonegotiation to 'off', half duplex, 10 Mbps. It can be tested by:

    ifconfig eth0 down
    ethtool -s eth0 autoneg off speed 10 duplex half
    ifconfig eth0 up

Then checking 'dmesg' for the link speed.

Signed-off-by: Michel Stam
Signed-off-by: David S. Miller
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings

commit 68c3e59aa9cdf2d8870d8fbe4f37b1a509d0abeb
Author: Joe Perches
Date: Wed Oct 14 01:09:40 2015 -0700

ethtool: Use kcalloc instead of kmalloc for ethtool_get_strings

[ Upstream commit 077cb37fcf6f00a45f375161200b5ee0cd4e937b ]

It seems that kernel memory can leak into userspace by a kmalloc, ethtool_get_strings, then copy_to_user sequence. Avoid this by using kcalloc to zero fill the copied buffer.
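As a rough illustration of the pattern described above (a simplified sketch, not the actual net/core/ethtool.c code), zero-filling the allocation guarantees that any bytes the driver's get_strings() callback leaves untouched cannot carry stale kernel memory into the copy_to_user() call:

    /* Sketch: allocate zeroed memory so unwritten bytes cannot leak. */
    data = kcalloc(gstrings.len, ETH_GSTRING_LEN, GFP_USER);
    if (!data)
        return -ENOMEM;
    /* The driver may fill less than the full buffer. */
    ops->get_strings(dev, gstrings.string_set, data);
    if (copy_to_user(useraddr + sizeof(gstrings), data,
                     gstrings.len * ETH_GSTRING_LEN))
        ret = -EFAULT;
    kfree(data);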
Signed-off-by: Joe Perches
Acked-by: Ben Hutchings
Signed-off-by: David S. Miller
Signed-off-by: Ben Hutchings

commit a5e14d9fd0d6c7696207e95468ef01f790556a29
Author: Pravin B Shelar
Date: Mon Sep 28 17:24:25 2015 -0700

skbuff: Fix skb checksum partial check.

[ Upstream commit 31b33dfb0a144469dd805514c9e63f4993729a48 ]

Earlier patch 6ae459bda tried to detect invalid checksum-partial skbs by comparing the pull length to the checksum offset. But it does not work for all cases since the checksum offset depends on updates to skb->data.

The following patch fixes it by validating the checksum start offset after the skb->data pointer is updated. A negative value of the checksum start offset means there is no need to checksum.

Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin
Signed-off-by: Pravin B Shelar
Signed-off-by: David S. Miller
Signed-off-by: Ben Hutchings

commit c3321c4eafd33e3fda7a267776a9be6c2b5043f6
Author: Pravin B Shelar
Date: Tue Sep 22 12:57:53 2015 -0700

skbuff: Fix skb checksum flag on skb pull

[ Upstream commit 6ae459bdaaeebc632b16e54dcbabb490c6931d61 ]

A VXLAN device can receive an skb with checksum partial. But the checksum offset could be in the outer header, which is pulled on receive. This results in a negative checksum offset for the skb. Such an skb can cause the assert failure in skb_checksum_help(). The following patch fixes the bug by setting checksum-none while pulling the outer header.

Following is the kernel panic msg from an old kernel hitting the bug.

    ------------[ cut here ]------------
    kernel BUG at net/core/dev.c:1906!
    RIP: 0010:[] skb_checksum_help+0x144/0x150
    Call Trace:
     [] queue_userspace_packet+0x408/0x470 [openvswitch]
     [] ovs_dp_upcall+0x5d/0x60 [openvswitch]
     [] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
     [] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
     [] ovs_vport_receive+0x2a/0x30 [openvswitch]
     [] vxlan_rcv+0x53/0x60 [openvswitch]
     [] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
     [] udp_queue_rcv_skb+0x2dc/0x3b0
     [] __udp4_lib_rcv+0x1cf/0x6c0
     [] udp_rcv+0x1a/0x20
     [] ip_local_deliver_finish+0xdd/0x280
     [] ip_local_deliver+0x88/0x90
     [] ip_rcv_finish+0x10d/0x370
     [] ip_rcv+0x235/0x300
     [] __netif_receive_skb+0x55d/0x620
     [] netif_receive_skb+0x80/0x90
     [] virtnet_poll+0x555/0x6f0
     [] net_rx_action+0x134/0x290
     [] __do_softirq+0xa8/0x210
     [] call_softirq+0x1c/0x30
     [] do_softirq+0x65/0xa0
     [] irq_exit+0x8e/0xb0
     [] do_IRQ+0x63/0xe0
     [] common_interrupt+0x6e/0x6e

Reported-by: Anupam Chanda
Signed-off-by: Pravin B Shelar
Acked-by: Tom Herbert
Signed-off-by: David S. Miller
Signed-off-by: Ben Hutchings

commit 127500d724f8c43f452610c9080444eedb5eaa6c
Author: Sabrina Dubroca
Date: Thu Oct 15 14:25:03 2015 +0200

net: add length argument to skb_copy_and_csum_datagram_iovec

Without this length argument, we can read past the end of the iovec in memcpy_toiovec because we have no way of knowing the total length of the iovec's buffers.

This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb csum races when peeking") has been backported but that don't have the iov_iter conversion, which is almost all the stable trees <= 3.18.

This also fixes a kernel crash for NFS servers when the client uses -o nfsvers=3,proto=udp to mount the export.
Signed-off-by: Sabrina Dubroca
Reviewed-by: Hannes Frederic Sowa
[bwh: Backported to 3.2: adjust context in include/linux/skbuff.h]
Signed-off-by: Ben Hutchings

commit 4421196453ad90ff54c97b5d600d65464a3965b2
Author: Richard Guy Briggs
Date: Sun Mar 16 14:00:19 2014 -0400

sched: declare pid_alive as inline

commit 80e0b6e8a001361316a2d62b748fe677ec46b860 upstream.

We accidentally declared pid_alive without any extern/inline connotation. Some platforms were fine with this, some like ia64 and mips were very angry. If the function is inline, the prototype should be inline!

on ia64:

    include/linux/sched.h:1718: warning: 'pid_alive' declared inline after being called

Signed-off-by: Richard Guy Briggs
Signed-off-by: Eric Paris
Signed-off-by: Ben Hutchings
Cc: Neal Gompa

commit cc1875ecbc3c9fb2774097e03870280c91c1e0e1
Author: Dāvis Mosāns
Date: Fri Aug 21 07:29:22 2015 +0300

mvsas: Fix NULL pointer dereference in mvs_slot_task_free

commit 2280521719e81919283b82902ac24058f87dfc1b upstream.

When pci_pool_alloc fails in mvs_task_prep, task->lldd_task stays NULL, but it's later used in mvs_abort_task as the slot which is passed to mvs_slot_task_free, causing a NULL pointer dereference. Just return from mvs_slot_task_free when passed a NULL slot.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=101891

Signed-off-by: Dāvis Mosāns
Reviewed-by: Tomas Henzl
Reviewed-by: Johannes Thumshirn
Signed-off-by: James Bottomley
Signed-off-by: Ben Hutchings

commit 965d8d1de4c6c0dddc2f031bd09d8415f0521f3f
Author: NeilBrown
Date: Sat Oct 24 16:23:48 2015 +1100

md/raid10: don't clear bitmap bit when bad-block-list write fails.

commit c340702ca26a628832fade4f133d8160a55c29cc upstream.

When a write fails and a bad-block-list is present, we can update the bad-block-list instead of writing the data. If this succeeds then it is OK to clear the relevant bitmap-bit as no further 'sync' of the block is needed.

However if writing the bad-block-list fails then we need to treat the write as failed and particularly must not clear the bitmap bit. Otherwise the device can be re-added (after any hardware connection issues are resolved) and, because the relevant bit in the bitmap is clear, that block will not be resynced. This leads to data corruption.

We already delay the final bio_endio() on the write until the bad-block-list is written so that when the write returns: either that data is safe, the bad-block record is safe, or the fact that the device is faulty is safe. However we *don't* delay the clearing of the bitmap, so the bitmap bit can be recorded as cleared before we know if the bad-block-list was written safely.

So: delay that until the write really is safe. i.e. move the call to close_write() until just before calling bio_endio(), and recheck the 'is array degraded' status before making that call.

This bug goes back to v3.1 when bad-block-lists were introduced, though it only affects arrays created with mdadm-3.3 or later as only those have bad-block lists.

Backports will require at least Commit: 95af587e95aa ("md/raid10: ensure device failure recorded before write request returns.") as well. I'll send that to 'stable' separately.

Note that of the two tests of R10BIO_WriteError that this patch adds, the first is certain to fail and the second is certain to succeed. However doing it this way makes the patch more obviously correct. I will tidy the code up in a future merge window.
Reported-by: Nate Dailey
Fixes: bd870a16c594 ("md/raid10: Handle write errors by updating badblock log.")
Signed-off-by: NeilBrown
Signed-off-by: Ben Hutchings

commit c1fba1c813b9135565aa01bd8470b94f315078ac
Author: NeilBrown
Date: Fri Aug 14 11:26:17 2015 +1000

md/raid10: ensure device failure recorded before write request returns.

commit 95af587e95aacb9cfda4a9641069a5244a540dc8 upstream.

When a write to one of the legs of a RAID10 fails, the failure is recorded in the metadata of the other legs so that after a restart the data on the failed drive won't be trusted even if that drive seems to be working again (maybe a cable was unplugged).

Currently there is no interlock between the write request completing and the metadata update. So it is possible that the write will complete, the app will confirm success in some way, and then the machine will crash before the metadata update completes.

This is an extremely small hole for a race to fit in, but it is theoretically possible and so should be closed.

So:
 - set MD_CHANGE_PENDING when requesting a metadata update for a failed device, so we can know with certainty when it completes
 - queue requests that experienced an error on a new queue which is only processed after the metadata update completes
 - call raid_end_bio_io() on bios in that queue when the time comes.

Signed-off-by: NeilBrown
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 1f6c748a9d212708e2e0b5a43570fc22fc856689
Author: NeilBrown
Date: Sat Oct 24 16:02:16 2015 +1100

md/raid1: don't clear bitmap bit when bad-block-list write fails.

commit bd8688a199b864944bf62eebed0ca13b46249453 upstream.

When a write fails and a bad-block-list is present, we can update the bad-block-list instead of writing the data. If this succeeds then it is OK to clear the relevant bitmap-bit as no further 'sync' of the block is needed.

However if writing the bad-block-list fails then we need to treat the write as failed and particularly must not clear the bitmap bit. Otherwise the device can be re-added (after any hardware connection issues are resolved) and, because the relevant bit in the bitmap is clear, that block will not be resynced. This leads to data corruption.

We already delay the final bio_endio() on the write until the bad-block-list is written so that when the write returns: either that data is safe, the bad-block record is safe, or the fact that the device is faulty is safe. However we *don't* delay the clearing of the bitmap, so the bitmap bit can be recorded as cleared before we know if the bad-block-list was written safely.

So: delay that until the write really is safe. i.e. move the call to close_write() until just before calling bio_endio(), and recheck the 'is array degraded' status before making that call.

This bug goes back to v3.1 when bad-block-lists were introduced, though it only affects arrays created with mdadm-3.3 or later as only those have bad-block lists.

Backports will require at least Commit: 55ce74d4bfe1 ("md/raid1: ensure device failure recorded before write request returns.") as well. I'll send that to 'stable' separately.

Note that of the two tests of R1BIO_WriteError that this patch adds, the first is certain to fail and the second is certain to succeed. However doing it this way makes the patch more obviously correct. I will tidy the code up in a future merge window.
Reported-and-tested-by: Nate Dailey
Cc: Jes Sorensen
Fixes: cd5ff9a16f08 ("md/raid1: Handle write errors by updating badblock log.")
Signed-off-by: NeilBrown
Signed-off-by: Ben Hutchings

commit 6a1281c38ae60b71e308d818c2668a6b8bd4897b
Author: NeilBrown
Date: Fri Aug 14 11:11:10 2015 +1000

md/raid1: ensure device failure recorded before write request returns.

commit 55ce74d4bfe1b9444436264c637f39a152d1e5ac upstream.

When a write to one of the legs of a RAID1 fails, the failure is recorded in the metadata of the other leg(s) so that after a restart the data on the failed drive won't be trusted even if that drive seems to be working again (maybe a cable was unplugged).

Similarly, when we record a bad-block in response to a write failure, we must not let the write complete until the bad-block update is safe.

Currently there is no interlock between the write request completing and the metadata update. So it is possible that the write will complete, the app will confirm success in some way, and then the machine will crash before the metadata update completes.

This is an extremely small hole for a race to fit in, but it is theoretically possible and so should be closed.

So:
 - set MD_CHANGE_PENDING when requesting a metadata update for a failed device, so we can know with certainty when it completes
 - queue requests that experienced an error on a new queue which is only processed after the metadata update completes
 - call raid_end_bio_io() on bios in that queue when the time comes.

Signed-off-by: NeilBrown
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 12d1c67b7b482bea0503fb9e42e9d257498b5c32
Author: Mike Snitzer
Date: Thu Oct 22 10:56:40 2015 -0400

dm btree: fix leak of bufio-backed block in btree_split_beneath error path

commit 4dcb8b57df3593dcb20481d9d6cf79d1dc1534be upstream.

btree_split_beneath()'s error path had an outstanding FIXME that speaks directly to the potential for _not_ cleaning up a previously allocated bufio-backed block.

Fix this by releasing the previously allocated bufio block using unlock_block().

Reported-by: Mikulas Patocka
Signed-off-by: Mike Snitzer
Acked-by: Joe Thornber
Signed-off-by: Ben Hutchings

commit 11020754df47e783aa3908c2fbf33318b05b744e
Author: Joe Thornber
Date: Wed Oct 21 18:36:49 2015 +0100

dm btree remove: fix a bug when rebalancing nodes after removal

commit 2871c69e025e8bc507651d5a9cf81a8a7da9d24b upstream.

Commit 4c7e309340ff ("dm btree remove: fix bug in redistribute3") wasn't a complete fix for redistribute3().

The redistribute3 function takes 3 btree nodes and shares out the entries evenly between them. If the three nodes in total contained (MAX_ENTRIES * 3) - 1 entries between them then this was erroneously getting rebalanced as (MAX_ENTRIES - 1) on the left and right, and (MAX_ENTRIES + 1) in the center.

Fix this issue by being more careful about calculating the target number of entries for the left and right nodes.

Unit tested in userspace using this program: https://github.com/jthornber/redistribute3-test/blob/master/redistribute3_t.c

Signed-off-by: Joe Thornber
Signed-off-by: Mike Snitzer
Signed-off-by: Ben Hutchings

commit e3e62cc7abb53bc0317be8b3a0ba98b36768630d
Author: Guillaume Nault
Date: Thu Oct 22 16:57:10 2015 +0200

ppp: fix pppoe_dev deletion condition in pppoe_release()

commit 1acea4f6ce1b1c0941438aca75dd2e5c6b09db60 upstream.

We can't rely on PPPOX_ZOMBIE to decide whether to clear po->pppoe_dev. PPPOX_ZOMBIE can be set by pppoe_disc_rcv() even when po->pppoe_dev is NULL.
So we have no guarantee that (sk->sk_state & PPPOX_ZOMBIE) implies (po->pppoe_dev != NULL).

Since we're releasing a PPPoE socket, we want to release the pppoe_dev if it exists and reset sk_state to PPPOX_DEAD, no matter the previous value of sk_state. So we can just check for po->pppoe_dev and avoid any assumption on sk->sk_state.

Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller
Signed-off-by: Ben Hutchings

commit 279fc860b960bfa4bff4083607a5647d3b0982cb
Author: Jan Kara
Date: Thu Oct 22 13:32:21 2015 -0700

mm: make sendfile(2) killable

commit 296291cdd1629c308114504b850dc343eabc2782 upstream.

Currently, the simple program below issues a sendfile(2) system call which takes about 62 days to complete in my test KVM instance.

    int fd;
    off_t off = 0;

    fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
    ftruncate(fd, 2);
    lseek(fd, 0, SEEK_END);
    sendfile(fd, fd, &off, 0xfffffff);

Now you should not ask the kernel to do stupid stuff like copying 256MB in 2-byte chunks and calling fsync(2) after each chunk, but if you do, the sysadmin should have a way to stop you.

We actually do have a check for fatal_signal_pending() in generic_perform_write() which triggers in this path; however, because we always succeed in writing something before the check is done, we return a value > 0 from generic_perform_write() and thus the information about the signal gets lost.

Fix the problem by doing the signal check before writing anything. That way generic_perform_write() returns -EINTR, the error gets propagated up and the sendfile loop terminates early.

Signed-off-by: Jan Kara
Reported-by: Dmitry Vyukov
Cc: Al Viro
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Ben Hutchings

commit 08fd1afd90a71c8e061061c7153808b36cf6b8f7
Author: Vasant Hegde
Date: Fri Oct 16 15:53:29 2015 +0530

powerpc/rtas: Validate rtas.entry before calling enter_rtas()

commit 8832317f662c06f5c06e638f57bfe89a71c9b266 upstream.

Currently we do not validate rtas.entry before calling enter_rtas(). This leads to a kernel oops when user space calls the rtas system call on a powernv platform (see below). This patch adds code to validate rtas.entry before making the enter_rtas() call.

    Oops: Exception in kernel mode, sig: 4 [#1]
    SMP NR_CPUS=1024 NUMA PowerNV
    task: c000000004294b80 ti: c0000007e1a78000 task.ti: c0000007e1a78000
    NIP: 0000000000000000 LR: 0000000000009c14 CTR: c000000000423140
    REGS: c0000007e1a7b920 TRAP: 0e40   Not tainted  (3.18.17-340.el7_1.pkvm3_1_0.2400.1.ppc64le)
    MSR: 1000000000081000 CR: 00000000 XER: 00000000
    CFAR: c000000000009c0c SOFTE: 0
    NIP [0000000000000000] (null)
    LR [0000000000009c14] 0x9c14
    Call Trace:
     [c0000007e1a7bba0] [c00000000041a7f4] avc_has_perm_noaudit+0x54/0x110 (unreliable)
     [c0000007e1a7bd80] [c00000000002ddc0] ppc_rtas+0x150/0x2d0
     [c0000007e1a7be30] [c000000000009358] syscall_exit+0x0/0x98

Fixes: 55190f88789a ("powerpc: Add skeleton PowerNV platform")
Reported-by: NAGESWARA R. SASTRY
Signed-off-by: Vasant Hegde
[mpe: Reword change log, trim oops, and add stable + fixes]
Signed-off-by: Michael Ellerman
Signed-off-by: Ben Hutchings

commit f996f92d5e47ade54adca8b30a572b8268ca5949
Author: Ilia Mirkin
Date: Tue Oct 20 01:15:39 2015 -0400

drm/nouveau/gem: return only valid domain when there's only one

commit 2a6c521bb41ce862e43db46f52e7681d33e8d771 upstream.

On nv50+, we restrict the valid domains to just the one where the buffer was originally created.
However, after the buffer is evicted to system memory, we might move it back to a different domain that was not originally valid. When sharing the buffer and retrieving its GEM_INFO data, we still want the domain that will be valid for this buffer in a pushbuf, not the one where it currently happens to be.

This resolves fdo#92504 and several others. These are due to suspend evicting all buffers, making it more likely that they temporarily end up in the wrong place.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92504
Signed-off-by: Ilia Mirkin
Signed-off-by: Ben Skeggs
Signed-off-by: Ben Hutchings

commit 1ae4e01f97bad389752e168d7438c513ea80e95d
Author: Doron Tsur
Date: Sun Oct 11 15:58:17 2015 +0300

IB/cm: Fix rb-tree duplicate free and use-after-free

commit 0ca81a2840f77855bbad1b9f172c545c4dc9e6a4 upstream.

ib_send_cm_sidr_rep could sometimes erase the node from the sidr (depending on errors in the process). Since ib_send_cm_sidr_rep is called both from cm_sidr_req_handler and cm_destroy_id, cm_id_priv could be either erased from the rb_tree twice or not erased at all. Fix that by making sure it's erased only once before freeing cm_id_priv.

Fixes: a977049dacde ('[PATCH] IB: Add the kernel CM implementation')
Signed-off-by: Doron Tsur
Signed-off-by: Matan Barak
Signed-off-by: Doug Ledford
Signed-off-by: Ben Hutchings

commit d2420ed8657f3943408fd8a4c7c20d1fd31c9563
Author: Charles Keepax
Date: Tue Oct 20 10:25:58 2015 +0100

ASoC: wm8904: Correct number of EQ registers

commit 97aff2c03a1e4d343266adadb52313613efb027f upstream.

There are 24 EQ registers, not 25; I suspect this bug came about because the registers start at EQ1 not zero. The bug is relatively harmless as the extra register written is an unused one.

Signed-off-by: Charles Keepax
Signed-off-by: Mark Brown
Signed-off-by: Ben Hutchings

commit ff4b4b7e746fec864262066eadcc66f7e28575f8
Author: Herbert Xu
Date: Mon Oct 19 18:23:57 2015 +0800

crypto: api - Only abort operations on fatal signal

commit 3fc89adb9fa4beff31374a4bf50b3d099d88ae83 upstream.

Currently a number of Crypto API operations may fail when a signal occurs. This causes nasty problems as the callers of those operations are often not in a good position to restart the operation.

In fact there is currently no need for those operations to be interrupted by user signals at all. All we need is for them to be killable.

This patch replaces the relevant calls of signal_pending with fatal_signal_pending, and wait_for_completion_interruptible with wait_for_completion_killable, respectively.

Signed-off-by: Herbert Xu
[bwh: Backported to 3.2: drop change to crypto_user_skcipher_alg(), which we don't have]
Signed-off-by: Ben Hutchings

commit c79e716dd2eb51b353b7e4e71b9680627b069fb9
Author: Laura Abbott
Date: Mon Oct 12 11:30:13 2015 +0300

xhci: Add spurious wakeup quirk for LynxPoint-LP controllers

commit fd7cd061adcf5f7503515ba52b6a724642a839c8 upstream.

We received several reports of systems rebooting and powering on after an attempted shutdown. Testing showed that setting the XHCI_SPURIOUS_WAKEUP quirk in addition to the XHCI_SPURIOUS_REBOOT quirk allowed the system to shut down as expected for LynxPoint-LP xHCI controllers. Set the quirk back.

Note that the quirk was originally introduced for LynxPoint and LynxPoint-LP just for this same reason.
See: commit 638298dc66ea ("xhci: Fix spurious wakeups after S5 on Haswell")

It was later limited to only concern HP machines as it caused a regression on some machines, see both bug and commit:

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66171
commit 6962d914f317 ("xhci: Limit the spurious wakeup fix only to HP machines")

Later it was discovered that the powering on after shutdown was limited to LynxPoint-LP (Haswell-ULT) and that some non-LP HP machines suffered from spontaneous resume from S3 (which should not be related to the SPURIOUS_WAKEUP quirk at all). An attempt to fix this then removed the SPURIOUS_WAKEUP flag usage completely.

commit b45abacde3d5 ("xhci: no switching back on non-ULT Haswell")

Current understanding is that LynxPoint-LP (Haswell ULT) machines need the SPURIOUS_WAKEUP quirk, otherwise they will restart, and plain LynxPoint (Haswell) machines may _not_ have the quirk set, otherwise they again will restart.

Signed-off-by: Laura Abbott
Cc: Takashi Iwai
Cc: Oliver Neukum
[Added more history to commit message -Mathias]
Signed-off-by: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Ben Hutchings

commit 77156ac073a4946d7d59bd1f3b78847612151308
Author: Denis Turischev
Date: Fri Apr 25 19:20:14 2014 +0300

xhci: Switch Intel Lynx Point LP ports to EHCI on shutdown.

commits c09ec25d3684cad74d851c0f028a495999591279 and 0a939993bff117d3657108ca13b011fc0378aedb upstream.

The same issue as with Panther Point chipsets. If the USB ports are switched to xHCI on shutdown, the xHCI host will send a spurious interrupt, which will wake the system. Some BIOSes have a workaround for this, but not all. One example is Compulab's mini-desktop, the Intense-PC2.

The bug can be avoided if the USB ports are switched back to EHCI on shutdown.

This patch should be backported to stable kernels as old as 3.12, that contain the commit 638298dc66ea36623dbc2757a24fc2c4ab41b016 "xhci: Fix spurious wakeups after S5 on Haswell"

Signed-off-by: Denis Turischev
Signed-off-by: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman

Patch "xhci: Switch Intel Lynx Point ports to EHCI on shutdown." commit c09ec25d3684cad74d851c0f028a495999591279 is not fully correct. It switches both Lynx Point and Lynx Point-LP ports to EHCI on shutdown. On some Lynx Point machines it causes a spurious interrupt, which wakes the system: bugzilla.kernel.org/show_bug.cgi?id=76291

On Lynx Point-LP, on the contrary, switching ports to EHCI seems to be necessary to fix these spurious interrupts.

Signed-off-by: Denis Turischev
Reported-by: Wulf Richartz
Cc: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman
[bwh: Combined the above commits and backported to 3.2: adjust context to apply after "xhci: Limit the spurious wakeup fix only to HP machines" and "xhci: no switching back on non-ULT Haswell"]
Signed-off-by: Ben Hutchings

commit c35a75f2913c3aa3ac9d43683c8fccddde16b164
Author: Mathias Nyman
Date: Mon Oct 12 11:30:12 2015 +0300

xhci: handle no ping response error properly

commit 3b4739b8951d650becbcd855d7d6f18ac98a9a85 upstream.

If a host fails to wake up an isochronous SuperSpeed device from U1/U2 in time for an isoch transfer, it will generate a "No ping response error". The host will then move to the next transfer descriptor.

Handle this case in the same way as missed service errors: tag the current TD as skipped and handle it on the next transfer event.
Signed-off-by: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 7c6aca1947a312bcc23dc8e1000e4f13c7b43555
Author: Mathias Nyman
Date: Mon Oct 12 11:30:11 2015 +0300

xhci: don't finish a TD if we get a short transfer event mid TD

commit e210c422b6fdd2dc123bedc588f399aefd8bf9de upstream.

If the difference is big enough between the bytes asked and received in a bulk transfer we can get a short transfer event pointing to a TRB in the middle of the TD. We don't want to handle the TD yet as we will anyway receive a new event for the last TRB in the TD.

Hold off from finishing the TD and removing it from the list until we receive an event for the last TRB in the TD.

Signed-off-by: Mathias Nyman
Signed-off-by: Greg Kroah-Hartman
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit cca532890dad576fbe5cbff1ba9411d710292a9a
Author: Christian Zander
Date: Wed Jun 10 09:41:45 2015 -0700

iommu/vt-d: fix range computation when making room for large pages

commit ba2374fd2bf379f933773811fdb06cb6a5445f41 upstream.

In preparation for the installation of a large page, any small page tables that may still exist in the target IOV address range are removed. However, if a scatter/gather list entry is large enough to fit more than one large page, the address space for any subsequent large pages is not cleared of conflicting small page tables.

This can cause legitimate mapping requests to fail with errors of the form below, potentially followed by a series of IOMMU faults:

    ERROR: DMA PTE for vPFN 0xfde00 already set (to 7f83a4003 not 7e9e00083)

In this example, a 4MiB scatter/gather list entry resulted in the successful installation of a large page @ vPFN 0xfdc00, followed by a failed attempt to install another large page @ vPFN 0xfde00, due to the presence of a pointer to a small page table @ 0x7f83a4000.

To address this problem, compute the number of large pages that fit into a given scatter/gather list entry, and use it to derive the last vPFN covered by the large page(s).

Signed-off-by: Christian Zander
Signed-off-by: David Woodhouse
[bwh: Backported to 3.2:
 - Add the lvl_pages variable, added by an earlier commit upstream
 - Also change arguments to dma_pte_clear_range(), which is called by dma_pte_free_pagetable() upstream]
Signed-off-by: Ben Hutchings

commit 44a4ec5c68bbb8099ed82acf508f78724bc6ab92
Author: Russell King
Date: Fri Oct 9 20:43:33 2015 +0100

crypto: ahash - ensure statesize is non-zero

commit 8996eafdcbad149ac0f772fb1649fbb75c482a6a upstream.

Unlike shash algorithms, ahash drivers must implement export and import as their descriptors may contain hardware state and cannot be exported as is. Unfortunately some ahash drivers did not provide them and end up causing crashes with algif_hash.

This patch adds a check to prevent these drivers from registering ahash algorithms until they are fixed.

Signed-off-by: Russell King
Signed-off-by: Herbert Xu
Signed-off-by: Ben Hutchings

commit 680137aed96ddd825fcacbe84dcfd07a183eca79
Author: David Henningsson
Date: Tue Oct 13 10:10:18 2015 +0200

ALSA: hda - Fix inverted internal mic on Lenovo G50-80

commit e8d65a8d985271a102f07c7456da5b86c19ffe16 upstream.

Add the appropriate quirk to indicate the Lenovo G50-80 has a stereo mic input where one channel has reverse polarity.
Alsa-info available at: https://launchpadlibrarian.net/220846272/AlsaInfo.txt

BugLink: https://bugs.launchpad.net/bugs/1504778
Signed-off-by: David Henningsson
Signed-off-by: Takashi Iwai
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 26e6ab3e52ef3f4399bb7e94eb71c553cace88fe
Author: Cathy Avery
Date: Fri Oct 2 09:35:01 2015 -0400

xen-blkfront: check for null drvdata in blkback_changed (XenbusStateClosing)

commit a54c8f0f2d7df525ff997e2afe71866a1a013064 upstream.

xen-blkfront will crash if the check to talk_to_blkback() in blkback_changed() (XenbusStateInitWait) returns an error. The driver data is freed and info is set to NULL. Later during the close process, via talk_to_blkback's call to xenbus_dev_fatal(), the null pointer is passed to and dereferenced in blkfront_closing.

Signed-off-by: Cathy Avery
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Ben Hutchings

commit 677ee63365fecb0d15df00fc6bd282b0ced7eaef
Author: Christoph Hellwig
Date: Sat Oct 3 19:16:07 2015 +0200

3w-9xxx: don't unmap bounce buffered commands

commit 15e3d5a285ab9283136dba34bbf72886d9146706 upstream.

3w controllers don't dma map small single SGL entry commands but instead bounce buffer them. Add a helper to identify these commands and don't call scsi_dma_unmap for them.

Based on an earlier patch from James Bottomley.

Fixes: 118c85 ("3w-9xxx: fix command completion race")
Reported-by: Tóth Attila
Tested-by: Tóth Attila
Signed-off-by: Christoph Hellwig
Acked-by: Adam Radford
Signed-off-by: James Bottomley
Signed-off-by: Ben Hutchings

commit d4d8ceb528b8d241f8cfe95759ef6afb2e2e262f
Author: Peter Zijlstra
Date: Tue Sep 29 14:45:09 2015 +0200

sched/core: Fix TASK_DEAD race in finish_task_switch()

commit 95913d97914f44db2b81271c2e2ebd4d2ac2df83 upstream.

So the problem this patch is trying to address is as follows:

    CPU0                              CPU1

    context_switch(A, B)
                                      ttwu(A)
                                        LOCK A->pi_lock
                                        A->on_cpu == 0
    finish_task_switch(A)
      prev_state = A->state  <-.
      WMB                      |
      A->on_cpu = 0;           |
      UNLOCK rq0->lock         |
                               |    context_switch(C, A)
                               `--  A->state = TASK_DEAD
                                    prev_state == TASK_DEAD
                                      put_task_struct(A)
                            context_switch(A, C)
                            finish_task_switch(A)
                              A->state == TASK_DEAD
                                put_task_struct(A)

The argument being that the WMB will allow the load of A->state on CPU0 to cross over and observe CPU1's store of A->state, which will then result in a double-drop and use-after-free.

Now the comment states (and this was true once upon a long time ago) that we need to observe A->state while holding rq->lock because that will order us against the wakeup; however the wakeup will not in fact acquire (that) rq->lock; it takes A->pi_lock these days.

We can obviously fix this by upgrading the WMB to an MB, but that is expensive, so we'd rather avoid that.

The alternative this patch takes is: smp_store_release(&A->on_cpu, 0), which avoids the MB on some archs, but not important ones like ARM.
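In concrete terms (a minimal sketch of the ordering change, not the literal finish_task_switch() diff), the release ordering ensures that a remote waker which sees on_cpu == 0 also sees every store made before it; the 3.2 backport gets the same effect with a full barrier, since smp_store_release() does not exist there (see the backport note below):

    /* Upstream: publish on_cpu only after all earlier stores are visible. */
    smp_store_release(&prev->on_cpu, 0);

    /* 3.2 backport equivalent: full barrier, then the plain store. */
    smp_mb();
    prev->on_cpu = 0;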
Reported-by: Oleg Nesterov
Signed-off-by: Peter Zijlstra (Intel)
Acked-by: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org
Cc: manfred@colorfullife.com
Cc: will.deacon@arm.com
Fixes: e4a52bcb9a18 ("sched: Remove rq->lock from the first half of ttwu()")
Link: http://lkml.kernel.org/r/20150929124509.GG3816@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar
[bwh: Backported to 3.2:
 - Adjust filename
 - As smp_store_release() is not defined, use smp_mb()]
Signed-off-by: Ben Hutchings

commit 6f4f891beccece1177de960371eef2088e05b26f
Author: Takashi Iwai
Date: Mon Oct 5 16:55:09 2015 +0200

ALSA: synth: Fix conflicting OSS device registration on AWE32

commit 225db5762dc1a35b26850477ffa06e5cd0097243 upstream.

When OSS emulation is loaded on an ISA SB AWE32 chip, we now get kernel warnings like:

    WARNING: CPU: 0 PID: 2791 at fs/sysfs/dir.c:31 sysfs_warn_dup+0x51/0x80()
    sysfs: cannot create duplicate filename '/devices/isa/sbawe.0/sound/card0/seq-oss-0-0'

It's because both emux synth and opl3 drivers try to register their OSS device object with the same static index number 0. This hasn't been a big problem until the recent rewrite of device management code (that exposes sysfs at the same time), but it's been an obvious bug.

This patch works around it just by using a different index number for the emux synth object. There can be a more elegant way to fix this, but it's enough for now, as this code won't be touched so often anyway.

Reported-and-tested-by: Michael Shell
Signed-off-by: Takashi Iwai
Signed-off-by: Ben Hutchings

commit 45a7943d9fed337807ce04654d1a6bbf3ae746e0
Author: Johannes Berg
Date: Tue Sep 15 14:36:09 2015 +0200

iwlwifi: dvm: fix D3 firmware PN programming

commit 5bd166872d8f99f156fac191299d24f828bb2348 upstream.

The code to send the RX PN data (for each TID) to the firmware has a devastating bug: it overwrites the data for TID 0 with all the TID data, leaving the remaining TIDs zeroed. This will allow replays to actually be accepted by the firmware, which could allow waking up the system.

Signed-off-by: Johannes Berg
Signed-off-by: Luca Coelho
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings

commit a2a46f5816a863b33e79e1fecb74e06422350cfd
Author: Guillaume Nault
Date: Wed Sep 30 11:45:33 2015 +0200

ppp: don't override sk->sk_state in pppoe_flush_dev()

commit e6740165b8f7f06d8caee0fceab3fb9d790a6fed upstream.

Since commit 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release"), pppoe_release() calls dev_put(po->pppoe_dev) if sk is in the PPPOX_ZOMBIE state. But pppoe_flush_dev() can set sk->sk_state to PPPOX_ZOMBIE _and_ reset po->pppoe_dev to NULL.
This leads to the following oops:

    [  570.140800] BUG: unable to handle kernel NULL pointer dereference at 00000000000004e0
    [  570.142931] IP: [] pppoe_release+0x50/0x101 [pppoe]
    [  570.144601] PGD 3d119067 PUD 3dbc1067 PMD 0
    [  570.144601] Oops: 0000 [#1] SMP
    [  570.144601] Modules linked in: l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc loop crc32c_intel ghash_clmulni_intel jitterentropy_rng sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 ablk_helper cryptd lrw gf128mul glue_helper acpi_cpufreq evdev serio_raw processor button ext4 crc16 mbcache jbd2 virtio_net virtio_blk virtio_pci virtio_ring virtio
    [  570.144601] CPU: 1 PID: 15738 Comm: ppp-apitest Not tainted 4.2.0 #1
    [  570.144601] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Debian-1.8.2-1 04/01/2014
    [  570.144601] task: ffff88003d30d600 ti: ffff880036b60000 task.ti: ffff880036b60000
    [  570.144601] RIP: 0010:[] [] pppoe_release+0x50/0x101 [pppoe]
    [  570.144601] RSP: 0018:ffff880036b63e08 EFLAGS: 00010202
    [  570.144601] RAX: 0000000000000000 RBX: ffff880034340000 RCX: 0000000000000206
    [  570.144601] RDX: 0000000000000006 RSI: ffff88003d30dd20 RDI: ffff88003d30dd20
    [  570.144601] RBP: ffff880036b63e28 R08: 0000000000000001 R09: 0000000000000000
    [  570.144601] R10: 00007ffee9b50420 R11: ffff880034340078 R12: ffff8800387ec780
    [  570.144601] R13: ffff8800387ec7b0 R14: ffff88003e222aa0 R15: ffff8800387ec7b0
    [  570.144601] FS:  00007f5672f48700(0000) GS:ffff88003fc80000(0000) knlGS:0000000000000000
    [  570.144601] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [  570.144601] CR2: 00000000000004e0 CR3: 0000000037f7e000 CR4: 00000000000406a0
    [  570.144601] Stack:
    [  570.144601]  ffffffffa018f240 ffff8800387ec780 ffffffffa018f240 ffff8800387ec7b0
    [  570.144601]  ffff880036b63e48 ffffffff812caabe ffff880039e4e000 0000000000000008
    [  570.144601]  ffff880036b63e58 ffffffff812cabad ffff880036b63ea8 ffffffff811347f5
    [  570.144601] Call Trace:
    [  570.144601]  [] sock_release+0x1a/0x75
    [  570.144601]  [] sock_close+0xd/0x11
    [  570.144601]  [] __fput+0xff/0x1a5
    [  570.144601]  [] ____fput+0x9/0xb
    [  570.144601]  [] task_work_run+0x66/0x90
    [  570.144601]  [] prepare_exit_to_usermode+0x8c/0xa7
    [  570.144601]  [] syscall_return_slowpath+0x16d/0x19b
    [  570.144601]  [] int_ret_from_sys_call+0x25/0x9f
    [  570.144601] Code: 48 8b 83 c8 01 00 00 a8 01 74 12 48 89 df e8 8b 27 14 e1 b8 f7 ff ff ff e9 b7 00 00 00 8a 43 12 a8 0b 74 1c 48 8b 83 a8 04 00 00 <48> 8b 80 e0 04 00 00 65 ff 08 48 c7 83 a8 04 00 00 00 00 00 00
    [  570.144601] RIP  [] pppoe_release+0x50/0x101 [pppoe]
    [  570.144601]  RSP
    [  570.144601] CR2: 00000000000004e0
    [  570.200518] ---[ end trace 46956baf17349563 ]---

pppoe_flush_dev() has no reason to override sk->sk_state with PPPOX_ZOMBIE. pppox_unbind_sock() already sets sk->sk_state to PPPOX_DEAD, which is the correct state given that sk is unbound and po->pppoe_dev is NULL.

Fixes: 2b018d57ff18 ("pppoe: drop PPPOX_ZOMBIEs in pppoe_release")
Tested-by: Oleksii Berezhniak
Signed-off-by: Guillaume Nault
Signed-off-by: David S. Miller
Signed-off-by: Ben Hutchings

commit db0054fe613b21cd65c1609680cc47c520760044
Author: Jann Horn
Date: Sun Oct 4 19:29:12 2015 +0200

drivers/tty: require read access for controlling terminal

commit 0c55627167870255158db1cde0d28366f91c8872 upstream.

This is mostly a hardening fix, given that write-only access to other users' ttys is usually only given through setgid tty executables.
Signed-off-by: Jann Horn
Signed-off-by: Greg Kroah-Hartman
[bwh: Backported to 3.2:
 - __proc_set_tty() also takes a task_struct pointer]
Signed-off-by: Ben Hutchings

commit 80910ccdd3ee35e4131df38bc73b86ee60abdf0b
Author: Kosuke Tatsukawa
Date: Fri Oct 2 08:27:05 2015 +0000

tty: fix stall caused by missing memory barrier in drivers/tty/n_tty.c

commit e81107d4c6bd098878af9796b24edc8d4a9524fd upstream.

My colleague ran into a program stall on an x86_64 server, where n_tty_read() was waiting for data even though there was data in the buffer in the pty. The kernel stack for the stuck process looks like below.

    #0 [ffff88303d107b58] __schedule at ffffffff815c4b20
    #1 [ffff88303d107bd0] schedule at ffffffff815c513e
    #2 [ffff88303d107bf0] schedule_timeout at ffffffff815c7818
    #3 [ffff88303d107ca0] wait_woken at ffffffff81096bd2
    #4 [ffff88303d107ce0] n_tty_read at ffffffff8136fa23
    #5 [ffff88303d107dd0] tty_read at ffffffff81368013
    #6 [ffff88303d107e20] __vfs_read at ffffffff811a3704
    #7 [ffff88303d107ec0] vfs_read at ffffffff811a3a57
    #8 [ffff88303d107f00] sys_read at ffffffff811a4306
    #9 [ffff88303d107f50] entry_SYSCALL_64_fastpath at ffffffff815c86d7

There seem to be two problems causing this issue.

First, in drivers/tty/n_tty.c, __receive_buf() stores the data and updates ldata->commit_head using smp_store_release() and then checks the wait queue using waitqueue_active(). However, since there is no memory barrier, __receive_buf() could return without calling wake_up_interruptible_poll(), and at the same time, n_tty_read() could start to wait in wait_woken() as in the following chart.

    __receive_buf()                          n_tty_read()
    ------------------------------------------------------------------------
    if (waitqueue_active(&tty->read_wait))
    /* Memory operations issued after the
       RELEASE may be completed before the
       RELEASE operation has completed */
                                             add_wait_queue(&tty->read_wait, &wait);
                                             ...
                                             if (!input_available_p(tty, 0)) {
    smp_store_release(&ldata->commit_head,
                      ldata->read_head);
                                             ...
                                             timeout = wait_woken(&wait,
                                                                  TASK_INTERRUPTIBLE,
                                                                  timeout);
    ------------------------------------------------------------------------

The second problem is that n_tty_read() also lacks a memory barrier call and could also cause __receive_buf() to return without calling wake_up_interruptible_poll(), and n_tty_read() to wait in wait_woken() as in the chart below.

    __receive_buf()                          n_tty_read()
    ------------------------------------------------------------------------
                                             spin_lock_irqsave(&q->lock, flags);
                                             /* from add_wait_queue() */
                                             ...
                                             if (!input_available_p(tty, 0)) {
    /* Memory operations issued after the
       RELEASE may be completed before the
       RELEASE operation has completed */
    smp_store_release(&ldata->commit_head,
                      ldata->read_head);
    if (waitqueue_active(&tty->read_wait))
                                             __add_wait_queue(q, wait);
                                             spin_unlock_irqrestore(&q->lock, flags);
                                             /* from add_wait_queue() */
                                             ...
                                             timeout = wait_woken(&wait,
                                                                  TASK_INTERRUPTIBLE,
                                                                  timeout);
    ------------------------------------------------------------------------

There are also other places in drivers/tty/n_tty.c which have similar calls to waitqueue_active(), so instead of adding many memory barrier calls, this patch simply removes the call to waitqueue_active(), leaving just wake_up*() behind.
This fixes both problems because, even though the memory access before or after the spinlocks in both wake_up*() and add_wait_queue() can sneak into the critical section, it cannot go past it and the critical section assures that they will be serialized (please see "INTER-CPU ACQUIRING BARRIER EFFECTS" in Documentation/memory-barriers.txt for a better explanation). Moreover, the resulting code is much simpler.

Latency measurement using a ping-pong test over a pty doesn't show any visible performance drop.

Signed-off-by: Kosuke Tatsukawa
Signed-off-by: Greg Kroah-Hartman
[bwh: Backported to 3.2:
 - Use wake_up_interruptible(), not wake_up_interruptible_poll()
 - There are only two spurious uses of waitqueue_active() to remove]
Signed-off-by: Ben Hutchings

commit 32b95ad7b61a82e610e4a72eefba043aaa4f6ce8
Author: Vincent Palatin
Date: Thu Oct 1 14:10:22 2015 -0700

usb: Add device quirk for Logitech PTZ cameras

commit 72194739f54607bbf8cfded159627a2015381557 upstream.

Add a device quirk for the Logitech PTZ Pro Camera and its sibling the ConferenceCam CC3000e Camera. This fixes the failed camera enumeration on some boots, particularly on machines with a fast CPU.

Tested by connecting a Logitech PTZ Pro Camera to a machine with a Haswell Core i7-4600U CPU @ 2.10GHz, and doing thousands of reboot cycles while recording the kernel logs and taking a camera picture after each boot. Before the patch, more than 7% of the boots showed some enumeration transfer failures and in a few of them, the kernel gave up before actually enumerating the webcam. After the patch, the enumeration has been correct on every reboot.

Signed-off-by: Vincent Palatin
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Ben Hutchings

commit e365a50eb15891e2f1de00890e5cad463a79cdd3
Author: Yao-Wen Mao
Date: Mon Aug 31 14:24:09 2015 +0800

USB: Add reset-resume quirk for two Plantronics usb headphones.

commit 8484bf2981b3d006426ac052a3642c9ce1d8d980 upstream.

These two headphones need a reset-resume quirk to properly resume to the original volume level.

Signed-off-by: Yao-Wen Mao
Signed-off-by: Greg Kroah-Hartman
Signed-off-by: Ben Hutchings

commit 9d2d631ce9e417f4ce85bd3951bb7af8f1203ba9
Author: Dan Carpenter
Date: Sat Aug 8 22:16:42 2015 +0300

iio: accel: sca3000: memory corruption in sca3000_read_first_n_hw_rb()

commit eda7d0f38aaf50dbb2a2de15e8db386c4f6f65fc upstream.

"num_read" is in byte units but we are writing u16s, so we end up writing twice as much as intended.

Signed-off-by: Dan Carpenter
Signed-off-by: Jonathan Cameron
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit 3f9808099be442aaca2b9ba5c7ef7c67c6382c23
Author: John Stultz
Date: Mon Sep 14 18:05:20 2015 -0700

clocksource: Fix abs() usage w/ 64bit values

commit 67dfae0cd72fec5cd158b6e5fb1647b7dbe0834c upstream.

This patch fixes one case where abs() was being used with 64-bit nanosecond values, where the result may be capped at 32 bits. This could potentially cause watchdog false negatives on 32-bit systems, so this patch addresses the issue by using abs64().

Signed-off-by: John Stultz
Cc: Prarit Bhargava
Cc: Richard Cochran
Cc: Ingo Molnar
Link: http://lkml.kernel.org/r/1442279124-7309-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings

commit e1263d46df6f5f96bc6abd039684b7778a72bab3
Author: NeilBrown
Date: Thu Sep 24 15:47:47 2015 +1000

md/raid0: apply base queue limits *before* disk_stack_limits

commit 66eefe5de11db1e0d8f2edc3880d50e7c36a9d43 upstream.
Calling e.g. blk_queue_max_hw_sectors() after calls to disk_stack_limits() discards the settings determined by disk_stack_limits(). So we need to make those calls first.

Fixes: 199dc6ed5179 ("md/raid0: update queue parameter in a safer location.")
Reported-by: Jes Sorensen
Signed-off-by: NeilBrown
[bwh: Backported to 3.2: the code being moved looks a little different]
Signed-off-by: Ben Hutchings

commit 1afa0468d0a71d798b24fc3a81f803aba491d22e
Author: NeilBrown
Date: Mon Aug 3 13:11:47 2015 +1000

md/raid0: update queue parameter in a safer location.

commit 199dc6ed5179251fa6158a461499c24bdd99c836 upstream.

When a (e.g.) RAID5 array is reshaped to RAID0, the updating of queue parameters (e.g. max number of sectors per bio) is done in the wrong place. It should be part of ->run, but it is actually part of ->takeover. This means it happens before level_store() calls:

    blk_set_stacking_limits(&mddev->queue->limits);

and so it is ineffective. This can lead to errors from underlying devices.

So move all the relevant settings out of create_stripe_zones() and into raid0_run().

As this can lead to a bug-on it is suitable for any -stable kernel which supports reshape to RAID0. So 2.6.35 or later. As the bug has been present for five years there is no urgency, so no need to rush into -stable.

Fixes: 9af204cf720c ("md: Add support for Raid5->Raid0 and Raid10->Raid0 takeover")
Reported-by: Yi Zhang
Signed-off-by: NeilBrown
[bwh: Backported to 3.2:
 - md has no discard or write-same support
 - md is not used by dm-raid so mddev->queue is never null
 - Open-code rdev_for_each()
 - Adjust context]
Signed-off-by: Ben Hutchings

commit c62cf38b9cfadcf548f07c2d0d9fe4b8aa1c2387
Author: Steve French
Date: Mon Sep 28 17:21:07 2015 -0500

Do not fall back to SMBWriteX in set_file_size error cases

commit 646200a041203f440fb6fcf9cacd9efeda9de74c upstream.

The error paths in set_file_size for cifs and smb3 are incorrect.

In the unlikely event that a server did not support set file info of the file size, the code incorrectly falls back to trying SMBWriteX (note that only the original core SMB Write, used for example by DOS, can set the file size this way - this actually does not work for the more recent SMBWriteX). The idea was that since the old DOS SMB Write could set the file size if you write zero bytes at that offset, use that if the server rejects the normal set file info call.

Fortunately the SMBWriteX will never be sent on the wire (except when file size is zero) since the length and offset fields were reversed in the two places in this function that call SMBWriteX, causing the fall back path to return an error. It is also important to never call an SMB request from an SMB2/SMB3 session (which theoretically would be possible, and can cause a brief session drop, although the client recovers) so this should be fixed. In practice this path does not happen with modern servers but the error fall back to SMBWriteX is clearly wrong.

Removing the calls to SMBWriteX in the error paths in cifs_set_file_size. Pointed out by PaX/grsecurity team.

Signed-off-by: Steve French
Reported-by: PaX Team
CC: Emese Revfy
CC: Brad Spengler
[bwh: Backported to 3.2: deleted code looks slightly different]
Signed-off-by: Ben Hutchings

commit 846bc2d8bef1e0d253cdfabfe707f37fc8cd836d
Author: Mel Gorman
Date: Thu Oct 1 15:36:57 2015 -0700

mm: hugetlbfs: skip shared VMAs when unmapping private pages to satisfy a fault

commit 2f84a8990ebbe235c59716896e017c6b2ca1200f upstream.
SunDong reported the following on https://bugzilla.kernel.org/show_bug.cgi?id=103841

    I think I find a linux bug, I have the test cases is constructed. I can
    stable recurring problems in fedora22(4.0.4) kernel version, arch for
    x86_64. I construct transparent huge page, when the parent and child
    process with MAP_SHARE, MAP_PRIVATE way to access the same huge page
    area, it has the opportunity to lead to huge page copy on write failure,
    and then it will munmap the child corresponding mmap area, but then the
    child mmap area with VM_MAYSHARE attributes, child process munmap this
    area can trigger VM_BUG_ON in set_vma_resv_flags functions (vma->vm_flags
    & VM_MAYSHARE).

There were a number of problems with the report (e.g. it's hugetlbfs that triggers this, not transparent huge pages) but it was fundamentally correct in that a VM_BUG_ON in set_vma_resv_flags() can be triggered that looks like this

    vma ffff8804651fd0d0 start 00007fc474e00000 end 00007fc475e00000
    next ffff8804651fd018 prev ffff8804651fd188 mm ffff88046b1b1800
    prot 8000000000000027 anon_vma (null) vm_ops ffffffff8182a7a0
    pgoff 0 file ffff88106bdb9800 private_data (null)
    flags: 0x84400fb(read|write|shared|mayread|maywrite|mayexec|mayshare|dontexpand|hugetlb)
    ------------
    kernel BUG at mm/hugetlb.c:462!
    SMP
    Modules linked in: xt_pkttype xt_LOG xt_limit [..]
    CPU: 38 PID: 26839 Comm: map Not tainted 4.0.4-default #1
    Hardware name: Dell Inc. PowerEdge R810/0TT6JF, BIOS 2.7.4 04/26/2012
    set_vma_resv_flags+0x2d/0x30

The VM_BUG_ON is correct because private and shared mappings have different reservation accounting but the warning clearly shows that the VMA is shared.

When a private COW fails to allocate a new page then only the process that created the VMA gets the page -- all the children unmap the page. If the children access that data in the future then they get killed.

The problem is that the same file is mapped shared and private. During the COW, the allocation fails, the VMAs are traversed to unmap the other private pages but a shared VMA is found and the bug is triggered. This patch identifies such VMAs and skips them.

Signed-off-by: Mel Gorman
Reported-by: SunDong
Reviewed-by: Michal Hocko
Cc: Andrea Arcangeli
Cc: Hugh Dickins
Cc: Naoya Horiguchi
Cc: David Rientjes
Reviewed-by: Naoya Horiguchi
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Ben Hutchings

commit bde3a53c6f76c4b3ee2635336ebf0c018607eba7
Author: Ben Hutchings
Date: Sat Sep 26 12:23:56 2015 +0100

genirq: Fix race in register_irq_proc()

commit 95c2b17534654829db428f11bcf4297c059a2a7e upstream.

Per-IRQ directories in procfs are created only when a handler is first added to the irqdesc, not when the irqdesc is created. In the case of a shared IRQ, multiple tasks can race to create a directory. This race condition seems to have been present forever, but is easier to hit with async probing.

Signed-off-by: Ben Hutchings
Link: http://lkml.kernel.org/r/1443266636.2004.2.camel@decadent.org.uk
Signed-off-by: Thomas Gleixner

commit 5311d93d0d33ae878d5fbb35ea5693b9c813ba04
Author: Thomas Gleixner
Date: Wed Sep 30 08:38:22 2015 +0000

x86/process: Add proper bound checks in 64bit get_wchan()

commit eddd3826a1a0190e5235703d1e666affa4d13b96 upstream.

Dmitry Vyukov reported the following using trinity and the memory error detector AddressSanitizer (https://code.google.com/p/address-sanitizer/wiki/AddressSanitizerForKernel).
    [ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on address ffff88002e280000
    [ 124.576801] ffff88002e280000 is located 131938492886538 bytes to the left of 28857600-byte region [ffffffff81282e0a, ffffffff82e0830a)
    [ 124.578633] Accessed by thread T10915:
    [ 124.579295]  inlined     in describe_heap_address ./arch/x86/mm/asan/report.c:164
    [ 124.579295]  #0 ffffffff810dd277 in asan_report_error ./arch/x86/mm/asan/report.c:278
    [ 124.580137]  #1 ffffffff810dc6a0 in asan_check_region ./arch/x86/mm/asan/asan.c:37
    [ 124.581050]  #2 ffffffff810dd423 in __tsan_read8 ??:0
    [ 124.581893]  #3 ffffffff8107c093 in get_wchan ./arch/x86/kernel/process_64.c:444

The address checks in the 64bit implementation of get_wchan() are wrong in several ways:

 - The lower bound of the stack is not the start of the stack page. It's the start of the stack page plus sizeof (struct thread_info)

 - The upper bound must be: top_of_stack - TOP_OF_KERNEL_STACK_PADDING - 2 * sizeof(unsigned long).

   The 2 * sizeof(unsigned long) is required because the stack pointer points at the frame pointer. The layout on the stack is: ... IP FP ... IP FP. So we need to make sure that both IP and FP are in the bounds.

Fix the bound checks and get rid of the mix of numeric constants, u64 and unsigned long. Making all unsigned long allows us to use the same function for 32bit as well.

Use READ_ONCE() when accessing the stack. This does not prevent a concurrent wakeup of the task and the stack changing, but at least it avoids TOCTOU.

Also check task state at the end of the loop. Again that does not prevent concurrent changes, but it avoids walking for nothing.

Add proper comments while at it.

Reported-by: Dmitry Vyukov
Reported-by: Sasha Levin
Based-on-patch-from: Wolfram Gloger
Signed-off-by: Thomas Gleixner
Reviewed-by: Borislav Petkov
Reviewed-by: Dmitry Vyukov
Cc: Andrey Ryabinin
Cc: Andy Lutomirski
Cc: Andrey Konovalov
Cc: Kostya Serebryany
Cc: Alexander Potapenko
Cc: kasan-dev
Cc: Denys Vlasenko
Cc: Andi Kleen
Cc: Wolfram Gloger
Link: http://lkml.kernel.org/r/20150930083302.694788319@linutronix.de
Signed-off-by: Thomas Gleixner
[bwh: Backported to 3.2:
 - s/READ_ONCE/ACCESS_ONCE/
 - Remove use of TOP_OF_KERNEL_STACK_PADDING, not defined here and would be defined as 0]
Signed-off-by: Ben Hutchings

commit b9f15ae6d4b2f46f07be713f5910f0185f267601
Author: James Hogan
Date: Fri Mar 27 08:33:43 2015 +0000

MIPS: dma-default: Fix 32-bit fall back to GFP_DMA

commit 53960059d56ecef67d4ddd546731623641a3d2d1 upstream.

If there is a DMA zone (usually 24bit = 16MB I believe), but no DMA32 zone, as is the case for some 32-bit kernels, then massage_gfp_flags() will cause DMA memory allocated for devices with a 32..63-bit coherent_dma_mask to fall back to using __GFP_DMA, even though there may only be 32 bits of physical address available anyway.

Correct that case to compare against a mask the size of phys_addr_t instead of always using a 64-bit mask.

Signed-off-by: James Hogan
Fixes: a2e715a86c6d ("MIPS: DMA: Fix computation of DMA flags from device's coherent_dma_mask.")
Cc: Ralf Baechle
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/9610/
Signed-off-by: Ralf Baechle
Signed-off-by: Ben Hutchings

commit 02eb3b901e81417a8a0c9f2338fa24d8a7d00093
Author: shengyong
Date: Mon Sep 28 17:57:19 2015 +0000

UBI: return ENOSPC if no enough space available

commit 7c7feb2ebfc9c0552c51f0c050db1d1a004faac5 upstream.
commit 02eb3b901e81417a8a0c9f2338fa24d8a7d00093
Author: shengyong
Date: Mon Sep 28 17:57:19 2015 +0000

UBI: return ENOSPC if no enough space available

commit 7c7feb2ebfc9c0552c51f0c050db1d1a004faac5 upstream.

   UBI: attaching mtd1 to ubi0
   UBI: scanning is finished
   UBI error: init_volumes: not enough PEBs, required 706, available 686
   UBI error: ubi_wl_init: no enough physical eraseblocks (-20, need 1)
   UBI error: ubi_attach_mtd_dev: failed to attach mtd1, error -12 <= NOT ENOMEM
   UBI error: ubi_init: cannot attach mtd1

If the available PEBs are not enough when initializing volumes, return -ENOSPC directly. If the available PEBs are not enough when initializing WL, return -ENOSPC instead of -ENOMEM.

Signed-off-by: Sheng Yong
Signed-off-by: Richard Weinberger
Reviewed-by: David Gstir
Signed-off-by: Ben Hutchings

commit 73d95e7b04162376761b6c942d74cd18b1556dce
Author: Richard Weinberger
Date: Tue Sep 22 23:58:07 2015 +0200

UBI: Validate data_size

commit 281fda27673f833a01d516658a64d22a32c8e072 upstream.

Make sure that data_size is less than the LEB size. Otherwise a handcrafted UBI image is able to trigger an out-of-bounds memory access in ubi_compare_lebs().

Signed-off-by: Richard Weinberger
Reviewed-by: David Gstir
[bwh: Backported to 3.2: drop first argument to ubi_err()]
Signed-off-by: Ben Hutchings

commit a92a5446e2482938e1bce89789235cf814cd85d1
Author: Malcolm Crossley
Date: Mon Sep 28 11:36:52 2015 +0100

x86/xen: Do not clip xen_e820_map to xen_e820_map_entries when sanitizing map

commit 64c98e7f49100b637cd20a6c63508caed6bbba7a upstream.

Sanitizing the e820 map may produce extra E820 entries, which would result in the topmost E820 entries being removed. The removed entries would typically include the top E820 usable RAM region, and thus result in the domain having significantly less RAM available to it. Fix by allowing sanitize_e820_map to use the full size of the allocated E820 array.

Signed-off-by: Malcolm Crossley
Reviewed-by: Boris Ostrovsky
Signed-off-by: David Vrabel
[bwh: Backported to 3.2: s/xen_e820_map_entries/memmap.nr_entries/; s/xen_e820_map/map/g]
Signed-off-by: Ben Hutchings

commit d599a135f9562d30b74fefc3315d9490981cfd3f
Author: Andreas Schwab
Date: Wed Sep 23 23:12:09 2015 +0200

m68k: Define asmlinkage_protect

commit 8474ba74193d302e8340dddd1e16c85cc4b98caf upstream.

Make sure the compiler does not modify arguments of syscall functions. This can happen if the compiler generates a tailcall to another function. For example, without asmlinkage_protect, sys_openat is compiled into this function:

   sys_openat:
           clr.l %d0
           move.w 18(%sp),%d0
           move.l %d0,16(%sp)
           jbra do_sys_open

Note how the fourth argument is modified in place, modifying the register %d4 that gets restored from this stack slot when the function returns to user-space. The caller may expect the register to be unmodified across system calls.

Signed-off-by: Andreas Schwab
Signed-off-by: Geert Uytterhoeven
Signed-off-by: Ben Hutchings

commit 01352209ff46c75e1c5705f02502a393ce29b992
Author: Felix Fietkau
Date: Thu Sep 24 16:59:46 2015 +0200

ath9k: declare required extra tx headroom

commit 029cd0370241641eb70235d205aa0b90c84dce44 upstream.

ath9k inserts padding between the 802.11 header and the data area (to align it). Since it didn't declare this extra required headroom, this led to some nasty issues like randomly dropped packets in some setups.

Signed-off-by: Felix Fietkau
Signed-off-by: Kalle Valo
Signed-off-by: Ben Hutchings
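A minimal sketch of what declaring that headroom to mac80211 looks like, assuming the ath9k hw-capability setup path in drivers/net/wireless/ath/ath9k/init.c; the exact padding value used here is an assumption for illustration.

	/*
	 * Sketch: tell mac80211 how much extra headroom ath9k needs so that
	 * skbs handed to the driver always have room for the alignment padding
	 * inserted between the 802.11 header and the payload.
	 */
	static void ath9k_set_hw_capab(struct ath_softc *sc, struct ieee80211_hw *hw)
	{
		/* header padding can require a few extra bytes (assumed value) */
		hw->extra_tx_headroom = 4;

		/* ... remaining capability setup is unchanged ... */
	}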
commit 6767c520c2a3dab0f1291eece8fe8dc4eb107220
Author: Mark Brown
Date: Sat Sep 19 07:12:34 2015 -0700

regmap: debugfs: Don't bother actually printing when calculating max length

commit 176fc2d5770a0990eebff903ba680d2edd32e718 upstream.

The in-kernel snprintf() conveniently returns the actual length of the printed string even when given no output buffer at all, so just do that rather than relying on the user to pass in a suitable buffer. This ensures we don't need to worry about the output being truncated by the size of the buffer passed in.

Reported-by: Rasmus Villemoes
Signed-off-by: Mark Brown
Signed-off-by: Ben Hutchings

commit 9bd8029559877940f3f3b436f635794120ea3b2b
Author: Mark Brown
Date: Sat Sep 19 07:00:18 2015 -0700

regmap: debugfs: Ensure we don't underflow when printing access masks

commit b763ec17ac762470eec5be8ebcc43e4f8b2c2b82 upstream.

If a read is attempted which is smaller than the line length, then we may underflow the subtraction we're doing with the unsigned size_t type, so move some of the calculation to be additions on the right hand side instead in order to avoid this.

Reported-by: Rasmus Villemoes
Signed-off-by: Mark Brown
Signed-off-by: Ben Hutchings

commit 3895ff2d13043ecd091813b67a485ec487870b63
Author: Peter Zijlstra
Date: Thu Aug 20 10:34:59 2015 +0930

module: Fix locking in symbol_put_addr()

commit 275d7d44d802ef271a42dc87ac091a495ba72fc5 upstream.

Poma (on the way to another bug) reported an assertion triggering:

   [] module_assert_mutex_or_preempt+0x49/0x90
   [] __module_address+0x32/0x150
   [] __module_text_address+0x16/0x70
   [] symbol_put_addr+0x29/0x40
   [] dvb_frontend_detach+0x7d/0x90 [dvb_core]

Laura Abbott produced a patch which led us to inspect symbol_put_addr(). This function has a comment claiming it doesn't need to disable preemption around the module lookup because it holds a reference to the module it wants to find, which therefore cannot go away.

This is wrong (and a false optimization too: preempt_disable() is really rather cheap, and I doubt any of this is on uber-critical paths; otherwise it would've retained a pointer to the actual module anyway and avoided the second lookup).

While it's true that the module cannot go away while we hold a reference on it, the data structure we do the lookup in very much _CAN_ change while we do the lookup. Therefore fix the comment and add the required preempt_disable().

Reported-by: poma
Signed-off-by: Peter Zijlstra (Intel)
Signed-off-by: Rusty Russell
Fixes: a6e6abd575fc ("module: remove module_text_address()")
Signed-off-by: Ben Hutchings

commit 005f90fa5c6bf0b7d0f08df1b0b712e7e8d92b6f
Author: Ben Hutchings
Date: Thu Oct 15 01:20:29 2015 +0100

Revert "KVM: MMU: fix validation of mmio page fault"

This reverts commit 41e3025eacd6daafc40c3e7850fbcabc8b847805, which was commit 6f691251c0350ac52a007c54bf3ef62e9d8cdc5e upstream. The fix is only needed after commit f8f559422b6c ("KVM: MMU: fast invalidate all mmio sptes"), included in Linux 3.11.

Signed-off-by: Ben Hutchings
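For the symbol_put_addr() locking fix above, a hedged sketch of the resulting lookup is shown below; names follow kernel/module.c of that era, and it should be read as an illustration of the pattern rather than the exact backported hunk.

	/*
	 * Sketch of the fixed symbol_put_addr(): the reference prevents the
	 * module from being unloaded, but the structure that
	 * __module_text_address() walks can still change concurrently, so the
	 * lookup itself must run with preemption disabled.
	 */
	void symbol_put_addr(void *addr)
	{
		struct module *modaddr;
		unsigned long a = (unsigned long)dereference_function_descriptor(addr);

		if (core_kernel_text(a))
			return;

		/*
		 * Even though we hold a reference on the module, the lookup
		 * data structure is not protected by that reference alone.
		 */
		preempt_disable();
		modaddr = __module_text_address(a);
		BUG_ON(!modaddr);
		module_put(modaddr);
		preempt_enable();
	}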