commit 3b41b7e302241aa7c1396802691182bec41ea2f5 Author: Greg Kroah-Hartman Date: Wed May 18 18:35:34 2016 -0700 Linux 4.5.5 commit 7ef374ef659b976339f01cfb0b06006388f58b2c Author: Linus Torvalds Date: Sat May 14 11:11:44 2016 -0700 nf_conntrack: avoid kernel pointer value leak in slab name commit 31b0b385f69d8d5491a4bca288e25e63f1d945d0 upstream. The slab name ends up being visible in the directory structure under /sys, and even if you don't have access rights to the file you can see the filenames. Just use a 64-bit counter instead of the pointer to the 'net' structure to generate a unique name. This code will go away in 4.7 when the conntrack code moves to a single kmemcache, but this is the backportable simple solution to avoiding leaking kernel pointers to user space. Fixes: 5b3501faa874 ("netfilter: nf_conntrack: per netns nf_conntrack_cachep") Signed-off-by: Linus Torvalds Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9abb19bd9efe6999877edc10ffef3d34f261c767 Author: Yauhen Kharuzhy Date: Tue Mar 29 14:17:48 2016 -0700 btrfs: Reset IO error counters before start of device replacing commit 7ccefb98ce3e5c4493cd213cd03714b7149cf0cb upstream. If device replace entry was found on disk at mounting and its num_write_errors stats counter has non-NULL value, then replace operation will never be finished and -EIO error will be reported by btrfs_scrub_dev() because this counter is never reset. # mount -o degraded /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/ # btrfs replace status /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/ Started on 25.Mar 07:28:00, canceled on 25.Mar 07:28:01 at 0.0%, 40 write errs, 0 uncorr. read errs # btrfs replace start -B 4 /dev/sdg /media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/ ERROR: ioctl(DEV_REPLACE_START) failed on "/media/a4fb5c0a-21c5-4fe7-8d0e-fdd87d5f71ee/": Input/output error, no error Reset num_write_errors and num_uncorrectable_read_errors counters in the dev_replace structure before start of replacing. Signed-off-by: Yauhen Kharuzhy Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit f188c432c9db3be8a705f6825c5a9d9f837ed215 Author: Josef Bacik Date: Fri Mar 25 10:02:41 2016 -0400 Btrfs: don't use src fd for printk commit c79b4713304f812d3d6c95826fc3e5fc2c0b0c14 upstream. The fd we pass in may not be on a btrfs file system, so don't try to do BTRFS_I() on it. Thanks, Signed-off-by: Josef Bacik Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 440323f92e3542110978dcbca890aad8b3798c34 Author: David Sterba Date: Wed Mar 30 16:01:12 2016 +0200 btrfs: fallback to vmalloc in btrfs_compare_tree commit 8f282f71eaee7ac979cdbe525f76daa0722798a8 upstream. The allocation of node could fail if the memory is too fragmented for a given node size, practically observed with 64k. http://article.gmane.org/gmane.comp.file-systems.btrfs/54689 Reported-and-tested-by: Jean-Denis Girard Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 52ec554945bb69c7bd9790ce11338566b25ba293 Author: Mark Fasheh Date: Wed Mar 30 17:57:48 2016 -0700 btrfs: handle non-fatal errors in btrfs_qgroup_inherit() commit 918c2ee103cf9956f1c61d3f848dbb49fd2d104a upstream. create_pending_snapshot() will go readonly on _any_ error return from btrfs_qgroup_inherit(). If qgroups are enabled, a user can crash their fs by just making a snapshot and asking it to inherit from an invalid qgroup. For example: $ btrfs sub snap -i 1/10 /btrfs/ /btrfs/foo Will cause a transaction abort. Fix this by only throwing errors in btrfs_qgroup_inherit() when we know going readonly is acceptable. The following xfstests test case reproduces this bug: seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" here=`pwd` tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { cd / rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter # remove previous $seqres.full before test rm -f $seqres.full # real QA test starts here _supported_fs btrfs _supported_os Linux _require_scratch rm -f $seqres.full _scratch_mkfs _scratch_mount _run_btrfs_util_prog quota enable $SCRATCH_MNT # The qgroup '1/10' does not exist and should be silently ignored _run_btrfs_util_prog subvolume snapshot -i 1/10 $SCRATCH_MNT $SCRATCH_MNT/snap1 _scratch_unmount echo "Silence is golden" status=0 exit Signed-off-by: Mark Fasheh Reviewed-by: Qu Wenruo Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 24469ab4f3b7dd6fef36787bad42b2f75053556e Author: Liu Bo Date: Mon Mar 21 14:59:53 2016 -0700 Btrfs: fix invalid reference in replace_path commit 264813acb1c756aebc337b16b832604a0c9aadaf upstream. Dan Carpenter's static checker has found this error, it's introduced by commit 64c043de466d ("Btrfs: fix up read_tree_block to return proper error") It's really supposed to 'break' the loop on error like others. Cc: Dan Carpenter Reported-by: Dan Carpenter Signed-off-by: Liu Bo Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit b96513cce7b82c66f6d1ff87261d65d7cda08861 Author: Alex Lyakas Date: Thu Mar 10 13:10:15 2016 +0200 btrfs: do not write corrupted metadata blocks to disk commit 0f805531daa2ebfb5706422dc2ead1cff9e53e65 upstream. csum_dirty_buffer was issuing a warning in case the extent buffer did not look alright, but was still returning success. Let's return error in this case, and also add an additional sanity check on the extent buffer header. The caller up the chain may BUG_ON on this, for example flush_epd_write_bio will, but it is better than to have a silent metadata corruption on disk. Signed-off-by: Alex Lyakas Reviewed-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 342da5cefddbf818e1cb59537e021cdad9744e93 Author: Alex Lyakas Date: Thu Mar 10 13:09:46 2016 +0200 btrfs: csum_tree_block: return proper errno value commit 8bd98f0e6bf792e8fa7c3fed709321ad42ba8d2e upstream. Signed-off-by: Alex Lyakas Reviewed-by: Filipe Manana Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 13b7e683b67b36a7dc3a5bc98e3718028b3c5dfd Author: Filipe Manana Date: Thu Feb 25 23:19:38 2016 +0000 Btrfs: do not collect ordered extents when logging that inode exists commit 5e33a2bd7ca7fa687fb0965869196eea6815d1f3 upstream. When logging that an inode exists, for example as part of a directory fsync operation, we were collecting any ordered extents for the inode but we ended up doing nothing with them except tagging them as processed, by setting the flag BTRFS_ORDERED_LOGGED on them, which prevented a subsequent fsync of that inode (using the LOG_INODE_ALL mode) from collecting and processing them. This created a time window where a second fsync against the inode, using the fast path, ended up not logging the checksums for the new extents but it logged the extents since they were part of the list of modified extents. This happened because the ordered extents were not collected and checksums were not yet added to the csum tree - the ordered extents have not gone through btrfs_finish_ordered_io() yet (which is where we add them to the csum tree by calling inode.c:add_pending_csums()). So fix this by not collecting an inode's ordered extents if we are logging it with the LOG_INODE_EXISTS mode. Signed-off-by: Filipe Manana Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit ce91def62e14b955a3831af9123c60fefb68d421 Author: Filipe Manana Date: Wed Feb 24 07:35:05 2016 +0000 Btrfs: fix race when checking if we can skip fsync'ing an inode commit affc0ff902d539ebe9bba405d330410314f46e9f upstream. If we're about to do a fast fsync for an inode and btrfs_inode_in_log() returns false, it's possible that we had an ordered extent in progress (btrfs_finish_ordered_io() not run yet) when we noticed that the inode's last_trans field was not greater than the id of the last committed transaction, but shortly after, before we checked if there were any ongoing ordered extents, the ordered extent had just completed and removed itself from the inode's ordered tree, in which case we end up not logging the inode, losing some data if a power failure or crash happens after the fsync handler returns and before the transaction is committed. Fix this by checking first if there are any ongoing ordered extents before comparing the inode's last_trans with the id of the last committed transaction - when it completes, an ordered extent always updates the inode's last_trans before it removes itself from the inode's ordered tree (at btrfs_finish_ordered_io()). Signed-off-by: Filipe Manana Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 8eb64c913af6f50933b6a332123d7bb8688b7244 Author: Filipe Manana Date: Thu Feb 18 14:28:55 2016 +0000 Btrfs: fix deadlock between direct IO reads and buffered writes commit ade770294df29e08f913e5d733a756893128f45e upstream. While running a test with a mix of buffered IO and direct IO against the same files I hit a deadlock reported by the following trace: [11642.140352] INFO: task kworker/u32:3:15282 blocked for more than 120 seconds. [11642.142452] Not tainted 4.4.0-rc6-btrfs-next-21+ #1 [11642.143982] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11642.146332] kworker/u32:3 D ffff880230ef7988 [11642.147737] systemd-journald[571]: Sent WATCHDOG=1 notification. [11642.149771] 0 15282 2 0x00000000 [11642.151205] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs] [11642.154074] ffff880230ef7988 0000000000000246 0000000000014ec0 ffff88023ec94ec0 [11642.156722] ffff880233fe8f80 ffff880230ef8000 ffff88023ec94ec0 7fffffffffffffff [11642.159205] 0000000000000002 ffffffff8147b7f9 ffff880230ef79a0 ffffffff8147b541 [11642.161403] Call Trace: [11642.162129] [] ? bit_wait+0x2f/0x2f [11642.163396] [] schedule+0x82/0x9a [11642.164871] [] schedule_timeout+0x43/0x109 [11642.167020] [] ? bit_wait+0x2f/0x2f [11642.167931] [] ? trace_hardirqs_on_caller+0x17b/0x197 [11642.182320] [] ? trace_hardirqs_on+0xd/0xf [11642.183762] [] ? timekeeping_get_ns+0xe/0x33 [11642.185308] [] ? ktime_get+0x41/0x52 [11642.186782] [] io_schedule_timeout+0xa0/0x102 [11642.188217] [] ? io_schedule_timeout+0xa0/0x102 [11642.189626] [] bit_wait_io+0x1b/0x39 [11642.190803] [] __wait_on_bit_lock+0x4c/0x90 [11642.192158] [] __lock_page+0x66/0x68 [11642.193379] [] ? autoremove_wake_function+0x3a/0x3a [11642.194831] [] lock_page+0x31/0x34 [btrfs] [11642.197068] [] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs] [11642.199188] [] extent_writepages+0x4b/0x5c [btrfs] [11642.200723] [] ? btrfs_writepage_start_hook+0xce/0xce [btrfs] [11642.202465] [] btrfs_writepages+0x28/0x2a [btrfs] [11642.203836] [] do_writepages+0x23/0x2c [11642.205624] [] __filemap_fdatawrite_range+0x5a/0x61 [11642.207057] [] filemap_fdatawrite_range+0x13/0x15 [11642.208529] [] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs] [11642.210375] [] ? btrfs_scrubparity_helper+0x140/0x33a [btrfs] [11642.212132] [] btrfs_run_ordered_extent_work+0x25/0x34 [btrfs] [11642.213837] [] btrfs_scrubparity_helper+0x15c/0x33a [btrfs] [11642.215457] [] btrfs_flush_delalloc_helper+0xe/0x10 [btrfs] [11642.217095] [] process_one_work+0x256/0x48b [11642.218324] [] worker_thread+0x1f5/0x2a7 [11642.219466] [] ? rescuer_thread+0x289/0x289 [11642.220801] [] kthread+0xd4/0xdc [11642.222032] [] ? kthread_parkme+0x24/0x24 [11642.223190] [] ret_from_fork+0x3f/0x70 [11642.224394] [] ? kthread_parkme+0x24/0x24 [11642.226295] 2 locks held by kworker/u32:3/15282: [11642.227273] #0: ("%s-%s""btrfs", name){++++.+}, at: [] process_one_work+0x165/0x48b [11642.229412] #1: ((&work->normal_work)){+.+.+.}, at: [] process_one_work+0x165/0x48b [11642.231414] INFO: task kworker/u32:8:15289 blocked for more than 120 seconds. [11642.232872] Not tainted 4.4.0-rc6-btrfs-next-21+ #1 [11642.234109] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11642.235776] kworker/u32:8 D ffff88020de5f848 0 15289 2 0x00000000 [11642.237412] Workqueue: writeback wb_workfn (flush-btrfs-481) [11642.238670] ffff88020de5f848 0000000000000246 0000000000014ec0 ffff88023ed54ec0 [11642.240475] ffff88021b1ece40 ffff88020de60000 ffff88023ed54ec0 7fffffffffffffff [11642.242154] 0000000000000002 ffffffff8147b7f9 ffff88020de5f860 ffffffff8147b541 [11642.243715] Call Trace: [11642.244390] [] ? bit_wait+0x2f/0x2f [11642.245432] [] schedule+0x82/0x9a [11642.246392] [] schedule_timeout+0x43/0x109 [11642.247479] [] ? bit_wait+0x2f/0x2f [11642.248551] [] ? trace_hardirqs_on_caller+0x17b/0x197 [11642.249968] [] ? trace_hardirqs_on+0xd/0xf [11642.251043] [] ? timekeeping_get_ns+0xe/0x33 [11642.252202] [] ? ktime_get+0x41/0x52 [11642.253210] [] io_schedule_timeout+0xa0/0x102 [11642.254307] [] ? io_schedule_timeout+0xa0/0x102 [11642.256118] [] bit_wait_io+0x1b/0x39 [11642.257131] [] __wait_on_bit_lock+0x4c/0x90 [11642.258200] [] __lock_page+0x66/0x68 [11642.259168] [] ? autoremove_wake_function+0x3a/0x3a [11642.260516] [] lock_page+0x31/0x34 [btrfs] [11642.261841] [] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs] [11642.263531] [] extent_writepages+0x4b/0x5c [btrfs] [11642.264747] [] ? btrfs_writepage_start_hook+0xce/0xce [btrfs] [11642.266148] [] btrfs_writepages+0x28/0x2a [btrfs] [11642.267264] [] do_writepages+0x23/0x2c [11642.268280] [] __writeback_single_inode+0xda/0x5ba [11642.269407] [] writeback_sb_inodes+0x27b/0x43d [11642.270476] [] __writeback_inodes_wb+0x76/0xae [11642.271547] [] wb_writeback+0x19e/0x41c [11642.272588] [] wb_workfn+0x201/0x341 [11642.273523] [] ? wb_workfn+0x201/0x341 [11642.274479] [] process_one_work+0x256/0x48b [11642.275497] [] worker_thread+0x1f5/0x2a7 [11642.276518] [] ? rescuer_thread+0x289/0x289 [11642.277520] [] ? rescuer_thread+0x289/0x289 [11642.278517] [] kthread+0xd4/0xdc [11642.279371] [] ? kthread_parkme+0x24/0x24 [11642.280468] [] ret_from_fork+0x3f/0x70 [11642.281607] [] ? kthread_parkme+0x24/0x24 [11642.282604] 3 locks held by kworker/u32:8/15289: [11642.283423] #0: ("writeback"){++++.+}, at: [] process_one_work+0x165/0x48b [11642.285629] #1: ((&(&wb->dwork)->work)){+.+.+.}, at: [] process_one_work+0x165/0x48b [11642.287538] #2: (&type->s_umount_key#37){+++++.}, at: [] trylock_super+0x1b/0x4b [11642.289423] INFO: task fdm-stress:26848 blocked for more than 120 seconds. [11642.290547] Not tainted 4.4.0-rc6-btrfs-next-21+ #1 [11642.291453] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11642.292864] fdm-stress D ffff88022c107c20 0 26848 26591 0x00000000 [11642.294118] ffff88022c107c20 000000038108affa 0000000000014ec0 ffff88023ed54ec0 [11642.295602] ffff88013ab1ca40 ffff88022c108000 ffff8800b2fc19d0 00000000000e0fff [11642.297098] ffff8800b2fc19b0 ffff88022c107c88 ffff88022c107c38 ffffffff8147b541 [11642.298433] Call Trace: [11642.298896] [] schedule+0x82/0x9a [11642.299738] [] lock_extent_bits+0xfe/0x1a3 [btrfs] [11642.300833] [] ? add_wait_queue_exclusive+0x44/0x44 [11642.301943] [] lock_and_cleanup_extent_if_need+0x68/0x18e [btrfs] [11642.303270] [] __btrfs_buffered_write+0x238/0x4c1 [btrfs] [11642.304552] [] ? btrfs_file_write_iter+0x17c/0x408 [btrfs] [11642.305782] [] btrfs_file_write_iter+0x2f4/0x408 [btrfs] [11642.306878] [] __vfs_write+0x7c/0xa5 [11642.307729] [] vfs_write+0x9d/0xe8 [11642.308602] [] SyS_write+0x50/0x7e [11642.309410] [] entry_SYSCALL_64_fastpath+0x12/0x6b [11642.310403] 3 locks held by fdm-stress/26848: [11642.311108] #0: (&f->f_pos_lock){+.+.+.}, at: [] __fdget_pos+0x3a/0x40 [11642.312578] #1: (sb_writers#11){.+.+.+}, at: [] __sb_start_write+0x5f/0xb0 [11642.314170] #2: (&sb->s_type->i_mutex_key#15){+.+.+.}, at: [] btrfs_file_write_iter+0x73/0x408 [btrfs] [11642.316796] INFO: task fdm-stress:26849 blocked for more than 120 seconds. [11642.317842] Not tainted 4.4.0-rc6-btrfs-next-21+ #1 [11642.318691] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11642.319959] fdm-stress D ffff8801964ffa68 0 26849 26591 0x00000000 [11642.321312] ffff8801964ffa68 00ff8801e9975f80 0000000000014ec0 ffff88023ed94ec0 [11642.322555] ffff8800b00b4840 ffff880196500000 ffff8801e9975f20 0000000000000002 [11642.323715] ffff8801e9975f18 ffff8800b00b4840 ffff8801964ffa80 ffffffff8147b541 [11642.325096] Call Trace: [11642.325532] [] schedule+0x82/0x9a [11642.326303] [] schedule_timeout+0x43/0x109 [11642.327180] [] ? mark_held_locks+0x5e/0x74 [11642.328114] [] ? _raw_spin_unlock_irq+0x2c/0x4a [11642.329051] [] ? trace_hardirqs_on_caller+0x17b/0x197 [11642.330053] [] __wait_for_common+0x109/0x147 [11642.330952] [] ? __wait_for_common+0x109/0x147 [11642.331869] [] ? usleep_range+0x4a/0x4a [11642.332925] [] ? wake_up_q+0x47/0x47 [11642.333736] [] wait_for_completion+0x24/0x26 [11642.334672] [] btrfs_wait_ordered_extents+0x1c8/0x217 [btrfs] [11642.335858] [] btrfs_mksubvol+0x224/0x45d [btrfs] [11642.336854] [] ? add_wait_queue_exclusive+0x44/0x44 [11642.337820] [] btrfs_ioctl_snap_create_transid+0x148/0x17a [btrfs] [11642.339026] [] btrfs_ioctl_snap_create_v2+0xc7/0x110 [btrfs] [11642.340214] [] btrfs_ioctl+0x590/0x27bd [btrfs] [11642.341123] [] ? mutex_unlock+0xe/0x10 [11642.341934] [] ? ext4_file_write_iter+0x2a3/0x36f [ext4] [11642.342936] [] ? __lock_is_held+0x3c/0x57 [11642.343772] [] ? rcu_read_unlock+0x3e/0x5d [11642.344673] [] do_vfs_ioctl+0x458/0x4dc [11642.346024] [] ? __fget_light+0x62/0x71 [11642.346873] [] SyS_ioctl+0x57/0x79 [11642.347720] [] entry_SYSCALL_64_fastpath+0x12/0x6b [11642.350222] 4 locks held by fdm-stress/26849: [11642.350898] #0: (sb_writers#11){.+.+.+}, at: [] __sb_start_write+0x5f/0xb0 [11642.352375] #1: (&type->i_mutex_dir_key#4/1){+.+.+.}, at: [] btrfs_mksubvol+0x4b/0x45d [btrfs] [11642.354072] #2: (&fs_info->subvol_sem){++++..}, at: [] btrfs_mksubvol+0xf4/0x45d [btrfs] [11642.355647] #3: (&root->ordered_extent_mutex){+.+...}, at: [] btrfs_wait_ordered_extents+0x50/0x217 [btrfs] [11642.357516] INFO: task fdm-stress:26850 blocked for more than 120 seconds. [11642.358508] Not tainted 4.4.0-rc6-btrfs-next-21+ #1 [11642.359376] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11642.368625] fdm-stress D ffff88021f167688 0 26850 26591 0x00000000 [11642.369716] ffff88021f167688 0000000000000001 0000000000014ec0 ffff88023edd4ec0 [11642.370950] ffff880128a98680 ffff88021f168000 ffff88023edd4ec0 7fffffffffffffff [11642.372210] 0000000000000002 ffffffff8147b7f9 ffff88021f1676a0 ffffffff8147b541 [11642.373430] Call Trace: [11642.373853] [] ? bit_wait+0x2f/0x2f [11642.374623] [] schedule+0x82/0x9a [11642.375948] [] schedule_timeout+0x43/0x109 [11642.376862] [] ? bit_wait+0x2f/0x2f [11642.377637] [] ? trace_hardirqs_on_caller+0x17b/0x197 [11642.378610] [] ? trace_hardirqs_on+0xd/0xf [11642.379457] [] ? timekeeping_get_ns+0xe/0x33 [11642.380366] [] ? ktime_get+0x41/0x52 [11642.381353] [] io_schedule_timeout+0xa0/0x102 [11642.382255] [] ? io_schedule_timeout+0xa0/0x102 [11642.383162] [] bit_wait_io+0x1b/0x39 [11642.383945] [] __wait_on_bit_lock+0x4c/0x90 [11642.384875] [] __lock_page+0x66/0x68 [11642.385749] [] ? autoremove_wake_function+0x3a/0x3a [11642.386721] [] lock_page+0x31/0x34 [btrfs] [11642.387596] [] extent_write_cache_pages.isra.19.constprop.35+0x1af/0x2f4 [btrfs] [11642.389030] [] extent_writepages+0x4b/0x5c [btrfs] [11642.389973] [] ? rcu_read_lock_sched_held+0x61/0x69 [11642.390939] [] ? btrfs_writepage_start_hook+0xce/0xce [btrfs] [11642.392271] [] ? __clear_extent_bit+0x26e/0x2c0 [btrfs] [11642.393305] [] btrfs_writepages+0x28/0x2a [btrfs] [11642.394239] [] do_writepages+0x23/0x2c [11642.395045] [] __filemap_fdatawrite_range+0x5a/0x61 [11642.395991] [] filemap_fdatawrite_range+0x13/0x15 [11642.397144] [] btrfs_start_ordered_extent+0xd0/0x1a1 [btrfs] [11642.398392] [] ? clear_extent_bit+0x17/0x19 [btrfs] [11642.399363] [] btrfs_get_blocks_direct+0x12b/0x61c [btrfs] [11642.400445] [] ? dio_bio_add_page+0x3d/0x54 [11642.401309] [] ? submit_page_section+0x7b/0x111 [11642.402213] [] do_blockdev_direct_IO+0x685/0xc24 [11642.403139] [] ? btrfs_page_exists_in_range+0x1a1/0x1a1 [btrfs] [11642.404360] [] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs] [11642.406187] [] __blockdev_direct_IO+0x31/0x33 [11642.407070] [] ? __blockdev_direct_IO+0x31/0x33 [11642.407990] [] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs] [11642.409192] [] btrfs_direct_IO+0x1c7/0x27e [btrfs] [11642.410146] [] ? btrfs_get_extent_fiemap+0x1c0/0x1c0 [btrfs] [11642.411291] [] generic_file_read_iter+0x89/0x4e1 [11642.412263] [] ? mark_lock+0x24/0x201 [11642.413057] [] __vfs_read+0x79/0x9d [11642.413897] [] vfs_read+0x8f/0xd2 [11642.414708] [] SyS_read+0x50/0x7e [11642.415573] [] entry_SYSCALL_64_fastpath+0x12/0x6b [11642.416572] 1 lock held by fdm-stress/26850: [11642.417345] #0: (&f->f_pos_lock){+.+.+.}, at: [] __fdget_pos+0x3a/0x40 [11642.418703] INFO: task fdm-stress:26851 blocked for more than 120 seconds. [11642.419698] Not tainted 4.4.0-rc6-btrfs-next-21+ #1 [11642.420612] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [11642.421807] fdm-stress D ffff880196483d28 0 26851 26591 0x00000000 [11642.422878] ffff880196483d28 00ff8801c8f60740 0000000000014ec0 ffff88023ed94ec0 [11642.424149] ffff8801c8f60740 ffff880196484000 0000000000000246 ffff8801c8f60740 [11642.425374] ffff8801bb711840 ffff8801bb711878 ffff880196483d40 ffffffff8147b541 [11642.426591] Call Trace: [11642.427013] [] schedule+0x82/0x9a [11642.427856] [] schedule_preempt_disabled+0x18/0x24 [11642.428852] [] mutex_lock_nested+0x1d7/0x3b4 [11642.429743] [] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs] [11642.430911] [] btrfs_wait_ordered_extents+0x50/0x217 [btrfs] [11642.432102] [] ? btrfs_wait_ordered_roots+0x57/0x191 [btrfs] [11642.433259] [] ? btrfs_wait_ordered_extents+0x50/0x217 [btrfs] [11642.434431] [] btrfs_wait_ordered_roots+0xcd/0x191 [btrfs] [11642.436079] [] btrfs_sync_fs+0xe0/0x1ad [btrfs] [11642.437009] [] ? SyS_tee+0x23c/0x23c [11642.437860] [] sync_fs_one_sb+0x20/0x22 [11642.438723] [] iterate_supers+0x75/0xc2 [11642.439597] [] sys_sync+0x52/0x80 [11642.440454] [] entry_SYSCALL_64_fastpath+0x12/0x6b [11642.441533] 3 locks held by fdm-stress/26851: [11642.442370] #0: (&type->s_umount_key#37){+++++.}, at: [] iterate_supers+0x5f/0xc2 [11642.444043] #1: (&fs_info->ordered_operations_mutex){+.+...}, at: [] btrfs_wait_ordered_roots+0x44/0x191 [btrfs] [11642.446010] #2: (&root->ordered_extent_mutex){+.+...}, at: [] btrfs_wait_ordered_extents+0x50/0x217 [btrfs] This happened because under specific timings the path for direct IO reads can deadlock with concurrent buffered writes. The diagram below shows how this happens for an example file that has the following layout: [ extent A ] [ extent B ] [ .... 0K 4K 8K CPU 1 CPU 2 CPU 3 DIO read against range [0K, 8K[ starts btrfs_direct_IO() --> calls btrfs_get_blocks_direct() which finds the extent map for the extent A and leaves the range [0K, 4K[ locked in the inode's io tree buffered write against range [4K, 8K[ starts __btrfs_buffered_write() --> dirties page at 4K a user space task calls sync for e.g or writepages() is invoked by mm writepages() run_delalloc_range() cow_file_range() --> ordered extent X for the buffered write is created and writeback starts --> calls btrfs_get_blocks_direct() again, without submitting first a bio for reading extent A, and finds the extent map for extent B --> calls lock_extent_direct() --> locks range [4K, 8K[ --> finds ordered extent X covering range [4K, 8K[ --> unlocks range [4K, 8K[ buffered write against range [0K, 8K[ starts __btrfs_buffered_write() prepare_pages() --> locks pages with offsets 0 and 4K lock_and_cleanup_extent_if_need() --> blocks attempting to lock range [0K, 8K[ in the inode's io tree, because the range [0, 4K[ is already locked by the direct IO task at CPU 1 --> calls btrfs_start_ordered_extent(oe X) btrfs_start_ordered_extent(oe X) --> At this point writeback for ordered extent X has not finished yet filemap_fdatawrite_range() btrfs_writepages() extent_writepages() extent_write_cache_pages() --> finds page with offset 0 with the writeback tag (and not dirty) --> tries to lock it --> deadlock, task at CPU 2 has the page locked and is blocked on the io range [0, 4K[ that was locked earlier by this task So fix this by falling back to a buffered read in the direct IO read path when an ordered extent for a buffered write is found. Signed-off-by: Filipe Manana Reviewed-by: Liu Bo Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 4165ca37d62621cbf4e3ea4eea3bb9aa4f8c32b6 Author: Filipe Manana Date: Fri Feb 12 14:44:00 2016 +0000 Btrfs: fix extent_same allowing destination offset beyond i_size commit f4dfe6871006c62abdccc77b2818b11f376e98e2 upstream. When using the same file as the source and destination for a dedup (extent_same ioctl) operation we were allowing it to dedup to a destination offset beyond the file's size, which doesn't make sense and it's not allowed for the case where the source and destination files are not the same file. This made de deduplication operation successful only when the source range corresponded to a hole, a prealloc extent or an extent with all bytes having a value of 0x00. This was also leaving a file hole (between i_size and destination offset) without the corresponding file extent items, which can be reproduced with the following steps for example: $ mkfs.btrfs -f /dev/sdi $ mount /dev/sdi /mnt/sdi $ xfs_io -f -c "pwrite -S 0xab 304457 404990" /mnt/sdi/foobar wrote 404990/404990 bytes at offset 304457 395 KiB, 99 ops; 0.0000 sec (31.150 MiB/sec and 7984.5149 ops/sec) $ /git/hub/duperemove/btrfs-extent-same 24576 /mnt/sdi/foobar 28672 /mnt/sdi/foobar 929792 Deduping 2 total files (28672, 24576): /mnt/sdi/foobar (929792, 24576): /mnt/sdi/foobar 1 files asked to be deduped i: 0, status: 0, bytes_deduped: 24576 24576 total bytes deduped in this operation $ umount /mnt/sdi $ btrfsck /dev/sdi Checking filesystem on /dev/sdi UUID: 98c528aa-0833-427d-9403-b98032ffbf9d checking extents checking free space cache checking fs roots root 5 inode 257 errors 100, file extent discount Found file extent holes: start: 712704, len: 217088 found 540673 bytes used err is 1 total csum bytes: 400 total tree bytes: 131072 total fs tree bytes: 32768 total extent tree bytes: 16384 btree space waste bytes: 123675 file data blocks allocated: 671744 referenced 671744 btrfs-progs v4.2.3 So fix this by not allowing the destination to go beyond the file's size, just as we do for the same where the source and destination files are not the same. A test for xfstests follows. Signed-off-by: Filipe Manana Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 36ff8607d6b0986289bf94bd9d110a1e0feeac38 Author: Filipe Manana Date: Fri Feb 12 11:34:23 2016 +0000 Btrfs: fix file loss on log replay after renaming a file and fsync commit 2be63d5ce929603d4e7cedabd9e992eb34a0ff95 upstream. We have two cases where we end up deleting a file at log replay time when we should not. For this to happen the file must have been renamed and a directory inode must have been fsynced/logged. Two examples that exercise these two cases are listed below. Case 1) $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir -p /mnt/a/b $ mkdir /mnt/c $ touch /mnt/a/b/foo $ sync $ mv /mnt/a/b/foo /mnt/c/ # Create file bar just to make sure the fsync on directory a/ does # something and it's not a no-op. $ touch /mnt/a/bar $ xfs_io -c "fsync" /mnt/a < power fail / crash > The next time the filesystem is mounted, the log replay procedure deletes file foo. Case 2) $ mkfs.btrfs -f /dev/sdb $ mount /dev/sdb /mnt $ mkdir /mnt/a $ mkdir /mnt/b $ mkdir /mnt/c $ touch /mnt/a/foo $ ln /mnt/a/foo /mnt/b/foo_link $ touch /mnt/b/bar $ sync $ unlink /mnt/b/foo_link $ mv /mnt/b/bar /mnt/c/ $ xfs_io -c "fsync" /mnt/a/foo < power fail / crash > The next time the filesystem is mounted, the log replay procedure deletes file bar. The reason why the files are deleted is because when we log inodes other then the fsync target inode, we ignore their last_unlink_trans value and leave the log without enough information to later replay the rename operations. So we need to look at the last_unlink_trans values and fallback to a transaction commit if they are greater than the id of the last committed transaction. So fix this by looking at the last_unlink_trans values and fallback to transaction commits when needed. Also, when logging other inodes (for case 1 we logged descendants of the fsync target inode while for case 2 we logged ascendants) we need to care about concurrent tasks updating the last_unlink_trans of inodes we are logging (which was already an existing problem in check_parent_dirs_for_sync()). Since we can not acquire their inode mutex (vfs' struct inode ->i_mutex), as that causes deadlocks with other concurrent operations that acquire the i_mutex of 2 inodes (other fsyncs or renames for example), we need to serialize on the log_mutex of the inode we are logging. A task setting a new value for an inode's last_unlink_trans must acquire the inode's log_mutex and it must do this update before doing the actual unlink operation (which is already the case except when deleting a snapshot). Conversely the task logging the inode must first log the inode and then check the inode's last_unlink_trans value while holding its log_mutex, as if its value is not greater then the id of the last committed transaction it means it logged a safe state of the inode's items, while if its value is not smaller then the id of the last committed transaction it means the inode state it has logged might not be safe (the concurrent task might have just updated last_unlink_trans but hasn't done yet the unlink operation) and therefore a transaction commit must be done. Test cases for xfstests follow in separate patches. Signed-off-by: Filipe Manana Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 29b3f509c92a94cd7751db8b96c29d6c8de6bf11 Author: Filipe Manana Date: Wed Feb 10 10:42:25 2016 +0000 Btrfs: fix unreplayable log after snapshot delete + parent dir fsync commit 1ec9a1ae1e30c733077c0b288c4301b66b7a81f2 upstream. If we delete a snapshot, fsync its parent directory and crash/power fail before the next transaction commit, on the next mount when we attempt to replay the log tree of the root containing the parent directory we will fail and prevent the filesystem from mounting, which is solvable by wiping out the log trees with the btrfs-zero-log tool but very inconvenient as we will lose any data and metadata fsynced before the parent directory was fsynced. For example: $ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt $ mkdir /mnt/testdir $ btrfs subvolume snapshot /mnt /mnt/testdir/snap $ btrfs subvolume delete /mnt/testdir/snap $ xfs_io -c "fsync" /mnt/testdir < crash / power failure and reboot > $ mount /dev/sdc /mnt mount: mount(2) failed: No such file or directory And in dmesg/syslog we get the following message and trace: [192066.361162] BTRFS info (device dm-0): failed to delete reference to snap, inode 257 parent 257 [192066.363010] ------------[ cut here ]------------ [192066.365268] WARNING: CPU: 4 PID: 5130 at fs/btrfs/inode.c:3986 __btrfs_unlink_inode+0x17a/0x354 [btrfs]() [192066.367250] BTRFS: Transaction aborted (error -2) [192066.368401] Modules linked in: btrfs dm_flakey dm_mod ppdev sha256_generic xor raid6_pq hmac drbg ansi_cprng aesni_intel acpi_cpufreq tpm_tis aes_x86_64 tpm ablk_helper evdev cryptd sg parport_pc i2c_piix4 psmouse lrw parport i2c_core pcspkr gf128mul processor serio_raw glue_helper button loop autofs4 ext4 crc16 mbcache jbd2 sd_mod sr_mod cdrom ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring crc32c_intel scsi_mod e1000 virtio floppy [last unloaded: btrfs] [192066.377154] CPU: 4 PID: 5130 Comm: mount Tainted: G W 4.4.0-rc6-btrfs-next-20+ #1 [192066.378875] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014 [192066.380889] 0000000000000000 ffff880143923670 ffffffff81257570 ffff8801439236b8 [192066.382561] ffff8801439236a8 ffffffff8104ec07 ffffffffa039dc2c 00000000fffffffe [192066.384191] ffff8801ed31d000 ffff8801b9fc9c88 ffff8801086875e0 ffff880143923710 [192066.385827] Call Trace: [192066.386373] [] dump_stack+0x4e/0x79 [192066.387387] [] warn_slowpath_common+0x99/0xb2 [192066.388429] [] ? __btrfs_unlink_inode+0x17a/0x354 [btrfs] [192066.389236] [] warn_slowpath_fmt+0x48/0x50 [192066.389884] [] __btrfs_unlink_inode+0x17a/0x354 [btrfs] [192066.390621] [] ? iput+0xb0/0x266 [192066.391200] [] btrfs_unlink_inode+0x1c/0x3d [btrfs] [192066.391930] [] check_item_in_log+0x1fe/0x29b [btrfs] [192066.392715] [] replay_dir_deletes+0x167/0x1cf [btrfs] [192066.393510] [] replay_one_buffer+0x417/0x570 [btrfs] [192066.394241] [] walk_up_log_tree+0x10e/0x1dc [btrfs] [192066.394958] [] walk_log_tree+0xa5/0x190 [btrfs] [192066.395628] [] btrfs_recover_log_trees+0x239/0x32c [btrfs] [192066.396790] [] ? replay_one_extent+0x50a/0x50a [btrfs] [192066.397891] [] open_ctree+0x1d8b/0x2167 [btrfs] [192066.398897] [] btrfs_mount+0x5ef/0x729 [btrfs] [192066.399823] [] ? trace_hardirqs_on+0xd/0xf [192066.400739] [] ? lockdep_init_map+0xb9/0x1b3 [192066.401700] [] mount_fs+0x67/0x131 [192066.402482] [] vfs_kern_mount+0x6c/0xde [192066.403930] [] btrfs_mount+0x1cb/0x729 [btrfs] [192066.404831] [] ? trace_hardirqs_on+0xd/0xf [192066.405726] [] ? lockdep_init_map+0xb9/0x1b3 [192066.406621] [] mount_fs+0x67/0x131 [192066.407401] [] vfs_kern_mount+0x6c/0xde [192066.408247] [] do_mount+0x893/0x9d2 [192066.409047] [] ? strndup_user+0x3f/0x8c [192066.409842] [] SyS_mount+0x75/0xa1 [192066.410621] [] entry_SYSCALL_64_fastpath+0x12/0x6b [192066.411572] ---[ end trace 2de42126c1e0a0f0 ]--- [192066.412344] BTRFS: error (device dm-0) in __btrfs_unlink_inode:3986: errno=-2 No such entry [192066.413748] BTRFS: error (device dm-0) in btrfs_replay_log:2464: errno=-2 No such entry (Failed to recover log tree) [192066.415458] BTRFS error (device dm-0): cleaner transaction attach returned -30 [192066.444613] BTRFS: open_ctree failed This happens because when we are replaying the log and processing the directory entry pointing to the snapshot in the subvolume tree, we treat its btrfs_dir_item item as having a location with a key type matching BTRFS_INODE_ITEM_KEY, which is wrong because the type matches BTRFS_ROOT_ITEM_KEY and therefore must be processed differently, as the object id refers to a root number and not to an inode in the root containing the parent directory. So fix this by triggering a transaction commit if an fsync against the parent directory is requested after deleting a snapshot. This is the simplest approach for a rare use case. Some alternative that avoids the transaction commit would require more code to explicitly delete the snapshot at log replay time (factoring out common code from ioctl.c: btrfs_ioctl_snap_destroy()), special care at fsync time to remove the log tree of the snapshot's root from the log root of the root of tree roots, amongst other steps. A test case for xfstests that triggers the issue follows. seq=`basename $0` seqres=$RESULT_DIR/$seq echo "QA output created by $seq" tmp=/tmp/$$ status=1 # failure is the default! trap "_cleanup; exit \$status" 0 1 2 3 15 _cleanup() { _cleanup_flakey cd / rm -f $tmp.* } # get standard environment, filters and checks . ./common/rc . ./common/filter . ./common/dmflakey # real QA test starts here _need_to_be_root _supported_fs btrfs _supported_os Linux _require_scratch _require_dm_target flakey _require_metadata_journaling $SCRATCH_DEV rm -f $seqres.full _scratch_mkfs >>$seqres.full 2>&1 _init_flakey _mount_flakey # Create a snapshot at the root of our filesystem (mount point path), delete it, # fsync the mount point path, crash and mount to replay the log. This should # succeed and after the filesystem is mounted the snapshot should not be visible # anymore. _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/snap1 _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/snap1 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT _flakey_drop_and_remount [ -e $SCRATCH_MNT/snap1 ] && \ echo "Snapshot snap1 still exists after log replay" # Similar scenario as above, but this time the snapshot is created inside a # directory and not directly under the root (mount point path). mkdir $SCRATCH_MNT/testdir _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT $SCRATCH_MNT/testdir/snap2 _run_btrfs_util_prog subvolume delete $SCRATCH_MNT/testdir/snap2 $XFS_IO_PROG -c "fsync" $SCRATCH_MNT/testdir _flakey_drop_and_remount [ -e $SCRATCH_MNT/testdir/snap2 ] && \ echo "Snapshot snap2 still exists after log replay" _unmount_flakey echo "Silence is golden" status=0 exit Signed-off-by: Filipe Manana Tested-by: Liu Bo Reviewed-by: Liu Bo Signed-off-by: Chris Mason Signed-off-by: Greg Kroah-Hartman commit 259d2fbd9de65c6a74b3b24e3108b1b1e503849b Author: David Sterba Date: Thu Oct 8 14:14:16 2015 +0200 btrfs: change max_inline default to 2048 commit f7e98a7fff8634ae655c666dc2c9fc55a48d0a73 upstream. The current practical default is ~4k on x86_64 (the logic is more complex, simplified for brevity), the inlined files land in the metadata group and thus consume space that could be needed for the real metadata. The inlining brings some usability surprises: 1) total space consumption measured on various filesystems and btrfs with DUP metadata was quite visible because of the duplicated data within metadata 2) inlined data may exhaust the metadata, which are more precious in case the entire device space is allocated to chunks (ie. balance cannot make the space more compact) 3) performance suffers a bit as the inlined blocks are duplicate and stored far away on the device. Proposed fix: set the default to 2048 This fixes namely 1), the total filesysystem space consumption will be on par with other filesystems. Partially fixes 2), more data are pushed to the data block groups. The characteristics of 3) are based on actual small file size distribution. The change is independent of the metadata blockgroup type (though it's most visible with DUP) or system page size as these parameters are not trival to find out, compared to file size. Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 202efaae84a571169bb9f0b17618eb7f718675cc Author: David Sterba Date: Thu Feb 11 15:30:07 2016 +0100 btrfs: remove error message from search ioctl for nonexistent tree commit 11ea474f74709fc764fb7e80306e0776f94ce8b8 upstream. Let's remove the error message that appears when the tree_id is not present. This can happen with the quota tree and has been observed in practice. The applications are supposed to handle -ENOENT and we don't need to report that in the system log as it's not a fatal error. Reported-by: Vlastimil Babka Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 94863434d5f90aae0a956f7c2b753f959dcc0c85 Author: Josef Bacik Date: Wed Jan 13 11:48:06 2016 -0500 Btrfs: fix truncate_space_check commit dc95f7bfc57fa4b75a77d0da84d5db249d74aa3f upstream. truncate_space_check is using btrfs_csum_bytes_to_leaves() but forgetting to multiply by nodesize so we get an actual byte count. We need a tracepoint here so that we have the matching reserve for the release that will come later. Also add a comment to make clear what the intent of truncate_space_check is. Signed-off-by: Josef Bacik Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit 94ad29d2ad1010e9966b385a3d4151fb475cfc61 Author: Zhao Lei Date: Fri Dec 18 21:33:05 2015 +0800 btrfs: reada: Fix in-segment calculation for reada commit 503785306d182ab624a2232856ef8ab503ee85f9 upstream. reada_zone->end is end pos of segment: end = start + cache->key.offset - 1; So we need to use "<=" in condition to judge is a pos in the segment. The problem happened rearly, because logical pos rarely pointed to last 4k of a blockgroup, but we need to fix it to make code right in logic. Signed-off-by: Zhao Lei Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman commit e23744bfad9fad9d1991298a449342e6f76dd1a6 Author: Alex Deucher Date: Wed May 11 16:21:03 2016 -0400 drm/amdgpu: fix DP mode validation commit c47b9e0944e483309d66c807d650ac8b8ceafb57 upstream. Switch the order of the loops to walk the rates on the top so we exhaust all DP 1.1 rate/lane combinations before trying DP 1.2 rate/lane combos. This avoids selecting rates that are supported by the monitor, but not the connector leading to valid modes getting rejected. bug: https://bugs.freedesktop.org/show_bug.cgi?id=95206 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit df6ac4bcc9c820841b17b9af0f8c270b54e485cb Author: Alex Deucher Date: Wed May 11 16:16:53 2016 -0400 drm/radeon: fix DP mode validation commit ff0bd441bdfbfa09d05fdba9829a0401a46635c1 upstream. Switch the order of the loops to walk the rates on the top so we exhaust all DP 1.1 rate/lane combinations before trying DP 1.2 rate/lane combos. This avoids selecting rates that are supported by the monitor, but not the connector leading to valid modes getting rejected. bug: https://bugs.freedesktop.org/show_bug.cgi?id=95206 Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 0f1b8876ab01c393da876070f8a970165db5e158 Author: Arindam Nath Date: Wed May 4 23:39:59 2016 +0530 drm/radeon: fix DP link training issue with second 4K monitor commit 1a738347df2ee4977459a8776fe2c62196bdcb1b upstream. There is an issue observed when we hotplug a second DP 4K monitor to the system. Sometimes, the link training fails for the second monitor after HPD interrupt generation. The issue happens when some queued or deferred transactions are already present on the AUX channel when we initiate a new transcation to (say) get DPCD or during link training. We set AUX_IGNORE_HPD_DISCON bit in the AUX_CONTROL register so that we can ignore any such deferred transactions when a new AUX transaction is initiated. Signed-off-by: Arindam Nath Reviewed-by: Alex Deucher Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit 9907c72bc2428326c31280568875d78ce848e2b9 Author: Imre Deak Date: Tue May 3 15:54:19 2016 +0300 drm/i915/bdw: Add missing delay during L3 SQC credit programming commit d6a862fe8c48229ba342648bcd535b2404724603 upstream. BSpec requires us to wait ~100 clocks before re-enabling clock gating, so make sure we do this. CC: Ville Syrjälä Signed-off-by: Imre Deak Reviewed-by: Ville Syrjälä Link: http://patchwork.freedesktop.org/patch/msgid/1462280061-1457-2-git-send-email-imre.deak@intel.com (cherry picked from commit 48e5d68d28f00c0cadac5a830980ff3222781abb) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman commit 6c1827a97251893cffd513e83c9b5ef5a20d254d Author: Lyude Date: Tue May 3 11:01:32 2016 -0400 Revert "drm/i915: start adding dp mst audio" commit 26792526cc3e29e3ccbc15c996beb61fa64be5af upstream. Right now MST audio is causing too many kernel panics to really keep around in the kernel. On top of that, even after fixing said panics it's still basically non-functional (at least on all the setups I've tested it on). Revert until we have a proper solution for this. This reverts commit 3d52ccf52f2c51f613e42e65be0f06e4e6788093. Signed-off-by: Lyude Fixes: 3d52ccf52f2c ("drm/i915: start adding dp mst audio") Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1462287692-28570-1-git-send-email-cpaul@redhat.com (cherry picked from commit 5a8f97ea04c98201deeb973c3f711c3c156115e9) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman commit 36d8fb8f65ad431fcb26142d6f871de20168fb8e Author: Daniel Vetter Date: Tue May 3 10:33:01 2016 +0200 drm/i915: Bail out of pipe config compute loop on LPT commit 2700818ac9f935d8590715eecd7e8cadbca552b6 upstream. LPT is pch, so might run into the fdi bandwidth constraint (especially since it has only 2 lanes). But right now we just force pipe_bpp back to 24, resulting in a nice loop (which we bail out with a loud WARN_ON). Fix this. Cc: Chris Wilson Cc: Maarten Lankhorst References: https://bugs.freedesktop.org/show_bug.cgi?id=93477 Signed-off-by: Daniel Vetter Tested-by: Chris Wilson Signed-off-by: Maarten Lankhorst Signed-off-by: Daniel Vetter Link: http://patchwork.freedesktop.org/patch/msgid/1462264381-7573-1-git-send-email-daniel.vetter@ffwll.ch (cherry picked from commit f58a1acc7e4a1f37d26124ce4c875c647fbcc61f) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman commit eca3d50f42068aaa949b633d70ea59914bb0b8dd Author: Lucas Stach Date: Thu May 5 10:16:44 2016 -0400 drm/radeon: fix PLL sharing on DCE6.1 (v2) commit e3c00d87845ab375f90fa6e10a5e72a3a5778cd3 upstream. On DCE6.1 PPLL2 is exclusively available to UNIPHYA, so it should not be taken into consideration when looking for an already enabled PLL to be shared with other outputs. This fixes the broken VGA port (TRAVIS DP->VGA bridge) on my Richland based laptop, where the internal display is connected to UNIPHYA through a TRAVIS DP->LVDS bridge. Bug: https://bugs.freedesktop.org/show_bug.cgi?id=78987 v2: agd: add check in radeon_get_shared_nondp_ppll as well, drop extra parameter. Signed-off-by: Lucas Stach Signed-off-by: Alex Deucher Signed-off-by: Greg Kroah-Hartman commit be19315585fa23cc9a2758e9f29f01b1095e134f Author: Ville Syrjälä Date: Tue Apr 26 19:46:32 2016 +0300 drm/i915: Update CDCLK_FREQ register on BDW after changing cdclk frequency commit a04e23d42a1ce5d5f421692bb1c7e9352832819d upstream. Update CDCLK_FREQ on BDW after changing the cdclk frequency. Not sure if this is a late addition to the spec, or if I simply overlooked this step when writing the original code. This is what Bspec has to say about CDCLK_FREQ: "Program this field to the CD clock frequency minus one. This is used to generate a divided down clock for miscellaneous timers in display." And the "Broadwell Sequences for Changing CD Clock Frequency" section clarifies this further: "For CD clock 337.5 MHz, program 337 decimal. For CD clock 450 MHz, program 449 decimal. For CD clock 540 MHz, program 539 decimal. For CD clock 675 MHz, program 674 decimal." Cc: Mika Kahola Fixes: b432e5cfd5e9 ("drm/i915: BDW clock change support") Signed-off-by: Ville Syrjälä Link: http://patchwork.freedesktop.org/patch/msgid/1461689194-6079-2-git-send-email-ville.syrjala@linux.intel.com Reviewed-by: Mika Kahola (cherry picked from commit 7f1052a8fa38df635ab0dc0e6025b64ab9834824) Signed-off-by: Jani Nikula Signed-off-by: Greg Kroah-Hartman commit 1943bd0f10074f9b4636a15e72c00d4e5142a8aa Author: Mauro Carvalho Chehab Date: Wed May 11 13:09:34 2016 -0300 Revert "[media] videobuf2-v4l2: Verify planes array in buffer dequeueing" commit 93f0750dcdaed083d6209b01e952e98ca730db66 upstream. This patch causes a Kernel panic when called on a DVB driver. This was also reported by David R : May 7 14:47:35 server kernel: [ 501.247123] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 May 7 14:47:35 server kernel: [ 501.247239] IP: [] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.247354] PGD cae6f067 PUD ca99c067 PMD 0 May 7 14:47:35 server kernel: [ 501.247426] Oops: 0000 [#1] SMP May 7 14:47:35 server kernel: [ 501.247482] Modules linked in: xfs tun xt_connmark xt_TCPMSS xt_tcpmss xt_owner xt_REDIRECT nf_nat_redirect xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 ts_kmp ts_bm xt_string ipt_REJECT nf_reject_ipv4 xt_recent xt_conntrack xt_multiport xt_pkttype xt_tcpudp xt_mark nf_log_ipv4 nf_log_common xt_LOG xt_limit iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter ip_tables ip6table_filter ip6_tables x_tables pppoe pppox dm_crypt ts2020 regmap_i2c ds3000 cx88_dvb dvb_pll cx88_vp3054_i2c mt352 videobuf2_dvb cx8800 cx8802 cx88xx pl2303 tveeprom videobuf2_dma_sg ppdev videobuf2_memops videobuf2_v4l2 videobuf2_core dvb_usb_digitv snd_hda_codec_via snd_hda_codec_hdmi snd_hda_codec_generic radeon dvb_usb snd_hda_intel amd64_edac_mod serio_raw snd_hda_codec edac_core fbcon k10temp bitblit softcursor snd_hda_core font snd_pcm_oss i2c_piix4 snd_mixer_oss tileblit drm_kms_helper syscopyarea snd_pcm snd_seq_dummy sysfillrect snd_seq_oss sysimgblt fb_sys_fops ttm snd_seq_midi r8169 snd_rawmidi drm snd_seq_midi_event e1000e snd_seq snd_seq_device snd_timer snd ptp pps_core i2c_algo_bit soundcore parport_pc ohci_pci shpchp tpm_tis tpm nfsd auth_rpcgss oid_registry hwmon_vid exportfs nfs_acl mii nfs bonding lockd grace lp sunrpc parport May 7 14:47:35 server kernel: [ 501.249564] CPU: 1 PID: 6889 Comm: vb2-cx88[0] Not tainted 4.5.3 #3 May 7 14:47:35 server kernel: [ 501.249644] Hardware name: System manufacturer System Product Name/M4A785TD-V EVO, BIOS 0211 07/08/2009 May 7 14:47:35 server kernel: [ 501.249767] task: ffff8800aebf3600 ti: ffff8801e07a0000 task.ti: ffff8801e07a0000 May 7 14:47:35 server kernel: [ 501.249861] RIP: 0010:[] [] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.250002] RSP: 0018:ffff8801e07a3de8 EFLAGS: 00010086 May 7 14:47:35 server kernel: [ 501.250071] RAX: 0000000000000283 RBX: ffff880210dc5000 RCX: 0000000000000283 May 7 14:47:35 server kernel: [ 501.250161] RDX: ffffffffa0222cf0 RSI: 0000000000000000 RDI: ffff880210dc5014 May 7 14:47:35 server kernel: [ 501.250251] RBP: ffff8801e07a3df8 R08: ffff8801e07a0000 R09: 0000000000000000 May 7 14:47:35 server kernel: [ 501.250348] R10: 0000000000000000 R11: 0000000000000001 R12: ffff8800cda2a9d8 May 7 14:47:35 server kernel: [ 501.250438] R13: ffff880210dc51b8 R14: 0000000000000000 R15: ffff8800cda2a828 May 7 14:47:35 server kernel: [ 501.250528] FS: 00007f5b77fff700(0000) GS:ffff88021fc40000(0000) knlGS:00000000adaffb40 May 7 14:47:35 server kernel: [ 501.250631] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b May 7 14:47:35 server kernel: [ 501.250704] CR2: 0000000000000004 CR3: 00000000ca19d000 CR4: 00000000000006e0 May 7 14:47:35 server kernel: [ 501.250794] Stack: May 7 14:47:35 server kernel: [ 501.250822] ffff8801e07a3df8 ffffffffa0222cfd ffff8801e07a3e70 ffffffffa0236beb May 7 14:47:35 server kernel: [ 501.250937] 0000000000000283 ffff8801e07a3e94 0000000000000000 0000000000000000 May 7 14:47:35 server kernel: [ 501.251051] ffff8800aebf3600 ffffffff8108d8e0 ffff8801e07a3e38 ffff8801e07a3e38 May 7 14:47:35 server kernel: [ 501.251165] Call Trace: May 7 14:47:35 server kernel: [ 501.251200] [] ? __verify_planes_array_core+0xd/0x10 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.251306] [] vb2_core_dqbuf+0x2eb/0x4c0 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251398] [] ? prepare_to_wait_event+0x100/0x100 May 7 14:47:35 server kernel: [ 501.251482] [] vb2_thread+0x1cb/0x220 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251569] [] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.251662] [] ? vb2_core_qbuf+0x230/0x230 [videobuf2_core] May 7 14:47:35 server kernel: [ 501.255982] [] kthread+0xc4/0xe0 May 7 14:47:35 server kernel: [ 501.260292] [] ? kthread_park+0x50/0x50 May 7 14:47:35 server kernel: [ 501.264615] [] ret_from_fork+0x3f/0x70 May 7 14:47:35 server kernel: [ 501.268962] [] ? kthread_park+0x50/0x50 May 7 14:47:35 server kernel: [ 501.273216] Code: 0d 01 74 16 48 8b 46 28 48 8b 56 30 48 89 87 d0 01 00 00 48 89 97 d8 01 00 00 5d c3 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 <8b> 46 04 48 89 e5 8d 50 f7 31 c0 83 fa 01 76 02 5d c3 48 83 7e May 7 14:47:35 server kernel: [ 501.282146] RIP [] __verify_planes_array.isra.3+0x1/0x80 [videobuf2_v4l2] May 7 14:47:35 server kernel: [ 501.286391] RSP May 7 14:47:35 server kernel: [ 501.290619] CR2: 0000000000000004 May 7 14:47:35 server kernel: [ 501.294786] ---[ end trace b2b354153ccad110 ]--- This reverts commit 2c1f6951a8a82e6de0d82b1158b5e493fc6c54ab. Cc: Sakari Ailus Cc: Hans Verkuil Fixes: 2c1f6951a8a8 ("[media] videobuf2-v4l2: Verify planes array in buffer dequeueing") Signed-off-by: Mauro Carvalho Chehab Signed-off-by: Greg Kroah-Hartman commit 6d8b57ff73188c389d1a3706fd67bb41915bc3ab Author: Marek Szyprowski Date: Mon May 9 09:31:47 2016 -0700 Input: max8997-haptic - fix NULL pointer dereference commit 6ae645d5fa385f3787bf1723639cd907fe5865e7 upstream. NULL pointer derefence happens when booting with DTB because the platform data for haptic device is not set in supplied data from parent MFD device. The MFD device creates only platform data (from Device Tree) for itself, not for haptic child. Unable to handle kernel NULL pointer dereference at virtual address 0000009c pgd = c0004000 [0000009c] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT SMP ARM (max8997_haptic_probe) from [] (platform_drv_probe+0x4c/0xb0) (platform_drv_probe) from [] (driver_probe_device+0x214/0x2c0) (driver_probe_device) from [] (__driver_attach+0xac/0xb0) (__driver_attach) from [] (bus_for_each_dev+0x68/0x9c) (bus_for_each_dev) from [] (bus_add_driver+0x1a0/0x218) (bus_add_driver) from [] (driver_register+0x78/0xf8) (driver_register) from [] (do_one_initcall+0x90/0x1d8) (do_one_initcall) from [] (kernel_init_freeable+0x15c/0x1fc) (kernel_init_freeable) from [] (kernel_init+0x8/0x114) (kernel_init) from [] (ret_from_fork+0x14/0x3c) Signed-off-by: Marek Szyprowski Fixes: 104594b01ce7 ("Input: add driver support for MAX8997-haptic") [k.kozlowski: Write commit message, add CC-stable] Signed-off-by: Krzysztof Kozlowski Signed-off-by: Dmitry Torokhov Signed-off-by: Greg Kroah-Hartman commit f18783e6ab935c8884e24c43d6e5d5c417e06923 Author: Al Viro Date: Thu May 5 16:25:35 2016 -0400 get_rock_ridge_filename(): handle malformed NM entries commit 99d825822eade8d827a1817357cbf3f889a552d6 upstream. Payloads of NM entries are not supposed to contain NUL. When we run into such, only the part prior to the first NUL goes into the concatenation (i.e. the directory entry name being encoded by a bunch of NM entries). We do stop when the amount collected so far + the claimed amount in the current NM entry exceed 254. So far, so good, but what we return as the total length is the sum of *claimed* sizes, not the actual amount collected. And that can grow pretty large - not unlimited, since you'd need to put CE entries in between to be able to get more than the maximum that could be contained in one isofs directory entry / continuation chunk and we are stop once we'd encountered 32 CEs, but you can get about 8Kb easily. And that's what will be passed to readdir callback as the name length. 8Kb __copy_to_user() from a buffer allocated by __get_free_page() Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit 9bd227fabce8cc260697ec1907342ae92473df06 Author: Steven Rostedt Date: Wed May 11 15:09:36 2016 -0400 tools lib traceevent: Do not reassign parg after collapse_tree() commit 106b816cb46ebd87408b4ed99a2e16203114daa6 upstream. At the end of process_filter(), collapse_tree() was changed to update the parg parameter, but the reassignment after the call wasn't removed. What happens is that the "current_op" gets modified and freed and parg is assigned to the new allocated argument. But after the call to collapse_tree(), parg is assigned again to the just freed "current_op", and this causes the tool to crash. The current_op variable must also be assigned to NULL in case of error, otherwise it will cause it to be free()ed twice. Signed-off-by: Steven Rostedt Acked-by: Namhyung Kim Fixes: 42d6194d133c ("tools lib traceevent: Refactor process_filter()") Link: http://lkml.kernel.org/r/20160511150936.678c18a1@gandalf.local.home Signed-off-by: Arnaldo Carvalho de Melo Signed-off-by: Greg Kroah-Hartman commit c9458953f34fc199b01d6d93848b3722c8bba04e Author: Johannes Thumshirn Date: Wed Apr 27 10:48:52 2016 +0200 qla1280: Don't allocate 512kb of host tags commit 2bcbc81421c511ef117cadcf0bee9c4340e68db0 upstream. The qla1280 driver sets the scsi_host_template's can_queue field to 0xfffff which results in an allocation failure when allocating the block layer tags for the driver's queues. This was introduced with the change for host wide tags in commit 64d513ac31b - "scsi: use host wide tags by default". Reduce can_queue to MAX_OUTSTANDING_COMMANDS (512) to solve the allocation error. Signed-off-by: Johannes Thumshirn Fixes: 64d513ac31b - "scsi: use host wide tags by default" Cc: Laura Abbott Cc: Michael Reed Reviewed-by: Laurence Oberman Reviewed-by: Lee Duncan Signed-off-by: Martin K. Petersen Signed-off-by: James Bottomley Signed-off-by: Greg Kroah-Hartman commit aa9c7ecf1b628f4e5a7eaf6adbe6f96c74e73570 Author: Al Viro Date: Wed Apr 27 01:11:55 2016 -0400 atomic_open(): fix the handling of create_error commit 10c64cea04d3c75c306b3f990586ffb343b63287 upstream. * if we have a hashed negative dentry and either CREAT|EXCL on r/o filesystem, or CREAT|TRUNC on r/o filesystem, or CREAT|EXCL with failing may_o_create(), we should fail with EROFS or the error may_o_create() has returned, but not ENOENT. Which is what the current code ends up returning. * if we have CREAT|TRUNC hitting a regular file on a read-only filesystem, we can't fail with EROFS here. At the very least, not until we'd done follow_managed() - we might have a writable file (or a device, for that matter) bound on top of that one. Moreover, the code downstream will see that O_TRUNC and attempt to grab the write access (*after* following possible mount), so if we really should fail with EROFS, it will happen. No need to do that inside atomic_open(). The real logics is much simpler than what the current code is trying to do - if we decided to go for simple lookup, ended up with a negative dentry *and* had create_error set, fail with create_error. No matter whether we'd got that negative dentry from lookup_real() or had found it in dcache. Acked-by: Miklos Szeredi Signed-off-by: Al Viro Signed-off-by: Greg Kroah-Hartman commit c8ace74a3e8801af9ac120698c098f41d6453de8 Author: Hans de Goede Date: Wed Apr 27 15:59:27 2016 +0200 regulator: axp20x: Fix axp22x ldo_io voltage ranges commit a2262e5a12e05389ab4c7fc5cf60016b041dd8dc upstream. The minium voltage of 1800mV is a copy and paste error from the axp20x regulator info. The correct minimum voltage for the ldo_io regulators on the axp22x is 700mV. Fixes: 1b82b4e4f954 ("regulator: axp20x: Add support for AXP22X regulators") Signed-off-by: Hans de Goede Acked-by: Chen-Yu Tsai Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit f64ce5854222226ef9cabb41ed255e070cbc8528 Author: Krzysztof Kozlowski Date: Mon Mar 28 13:09:56 2016 +0900 regulator: s2mps11: Fix invalid selector mask and voltages for buck9 commit 3b672623079bb3e5685b8549e514f2dfaa564406 upstream. The buck9 regulator of S2MPS11 PMIC had incorrect vsel_mask (0xff instead of 0x1f) thus reading entire register as buck9's voltage. This effectively caused regulator core to interpret values as higher voltages than they were and then to set real voltage much lower than intended. The buck9 provides power to other regulators, including LDO13 and LDO19 which supply the MMC2 (SD card). On Odroid XU3/XU4 the lower voltage caused SD card detection errors on Odroid XU3/XU4: mmc1: card never left busy state mmc1: error -110 whilst initialising SD card During driver probe the regulator core was checking whether initial voltage matches the constraints. With incorrect vsel_mask of 0xff and default value of 0x50, the core interpreted this as 5 V which is outside of constraints (3-3.775 V). Then the regulator core was adjusting the voltage to match the constraints. With incorrect vsel_mask this new voltage mapped to a vere low voltage in the driver. Signed-off-by: Krzysztof Kozlowski Reviewed-by: Javier Martinez Canillas Tested-by: Javier Martinez Canillas Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 911cf4b94f7b068f92eab0b02f620a44ad6e405d Author: Wanpeng Li Date: Wed May 11 17:55:18 2016 +0800 workqueue: fix rebind bound workers warning commit f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9 upstream. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0 Modules linked in: CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31 Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013 0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010 0000000000000000 0000000000000000 0000000000000000 ffff881037babba8 ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046 Call Trace: dump_stack+0x89/0xd4 __warn+0xfd/0x120 warn_slowpath_null+0x1d/0x20 rebind_workers+0x1c0/0x1d0 workqueue_cpu_up_callback+0xf5/0x1d0 notifier_call_chain+0x64/0x90 ? trace_hardirqs_on_caller+0xf2/0x220 ? notify_prepare+0x80/0x80 __raw_notifier_call_chain+0xe/0x10 __cpu_notify+0x35/0x50 notify_down_prepare+0x5e/0x80 ? notify_prepare+0x80/0x80 cpuhp_invoke_callback+0x73/0x330 ? __schedule+0x33e/0x8a0 cpuhp_down_callbacks+0x51/0xc0 cpuhp_thread_fun+0xc1/0xf0 smpboot_thread_fn+0x159/0x2a0 ? smpboot_create_threads+0x80/0x80 kthread+0xef/0x110 ? wait_for_completion+0xf0/0x120 ? schedule_tail+0x35/0xf0 ret_from_fork+0x22/0x50 ? __init_kthread_worker+0x70/0x70 ---[ end trace eb12ae47d2382d8f ]--- notify_down_prepare: attempt to take down CPU 0 failed This bug can be reproduced by below config w/ nohz_full= all cpus: CONFIG_BOOTPARAM_HOTPLUG_CPU0=y CONFIG_DEBUG_HOTPLUG_CPU0=y CONFIG_NO_HZ_FULL=y As Thomas pointed out: | If a down prepare callback fails, then DOWN_FAILED is invoked for all | callbacks which have successfully executed DOWN_PREPARE. | | But, workqueue has actually two notifiers. One which handles | UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE. | | Now look at the priorities of those callbacks: | | CPU_PRI_WORKQUEUE_UP = 5 | CPU_PRI_WORKQUEUE_DOWN = -5 | | So the call order on DOWN_PREPARE is: | | CB 1 | CB ... | CB workqueue_up() -> Ignores DOWN_PREPARE | CB ... | CB X ---> Fails | | So we call up to CB X with DOWN_FAILED | | CB 1 | CB ... | CB workqueue_up() -> Handles DOWN_FAILED | CB ... | CB X-1 | | So the problem is that the workqueue stuff handles DOWN_FAILED in the up | callback, while it should do it in the down callback. Which is not a good idea | either because it wants to be called early on rollback... | | Brilliant stuff, isn't it? The hotplug rework will solve this problem because | the callbacks become symetric, but for the existing mess, we need some | workaround in the workqueue code. The boot CPU handles housekeeping duty(unbound timers, workqueues, timekeeping, ...) on behalf of full dynticks CPUs. It must remain online when nohz full is enabled. There is a priority set to every notifier_blocks: workqueue_cpu_up > tick_nohz_cpu_down > workqueue_cpu_down So tick_nohz_cpu_down callback failed when down prepare cpu 0, and notifier_blocks behind tick_nohz_cpu_down will not be called any more, which leads to workers are actually not unbound. Then hotplug state machine will fallback to undo and online cpu 0 again. Workers will be rebound unconditionally even if they are not unbound and trigger the warning in this progress. This patch fix it by catching !DISASSOCIATED to avoid rebind bound workers. Cc: Tejun Heo Cc: Lai Jiangshan Cc: Thomas Gleixner Cc: Peter Zijlstra Cc: Frédéric Weisbecker Suggested-by: Lai Jiangshan Signed-off-by: Wanpeng Li Signed-off-by: Greg Kroah-Hartman commit bef800fb18cba0052314ef90c3f278db500e5490 Author: Boris Brezillon Date: Wed May 11 11:00:02 2016 +0200 ARM: dts: at91: sam9x5: Fix the memory range assigned to the PMC commit aab0a4c83ceb344d2327194bf354820e50607af6 upstream. The memory range assigned to the PMC (Power Management Controller) was not including the PMC_PCR register which are used to control peripheral clocks. This was working fine thanks to the page granularity of ioremap(), but started to fail when we switched to syscon/regmap, because regmap is making sure that all accesses are falling into the reserved range. Signed-off-by: Boris Brezillon Reported-by: Richard Genoud Tested-by: Richard Genoud Fixes: 863a81c3be1d ("clk: at91: make use of syscon to share PMC registers in several drivers") Signed-off-by: Nicolas Ferre Signed-off-by: Greg Kroah-Hartman commit 1c1c3e93daeec9989e50614e99697903d32942ed Author: Miklos Szeredi Date: Wed May 11 01:16:37 2016 +0200 vfs: rename: check backing inode being equal commit 9409e22acdfc9153f88d9b1ed2bd2a5b34d2d3ca upstream. If a file is renamed to a hardlink of itself POSIX specifies that rename(2) should do nothing and return success. This condition is checked in vfs_rename(). However it won't detect hard links on overlayfs where these are given separate inodes on the overlayfs layer. Overlayfs itself detects this condition and returns success without doing anything, but then vfs_rename() will proceed as if this was a successful rename (detach_mounts(), d_move()). The correct thing to do is to detect this condition before even calling into overlayfs. This patch does this by calling vfs_select_inode() to get the underlying inodes. Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit ad56dcb2447522f6a165cb9bccff379e96acca8d Author: Miklos Szeredi Date: Wed May 11 01:16:37 2016 +0200 vfs: add vfs_select_inode() helper commit 54d5ca871e72f2bb172ec9323497f01cd5091ec7 upstream. Signed-off-by: Miklos Szeredi Signed-off-by: Greg Kroah-Hartman commit b45e5d3d1a4f0e443fea0e91dc5a5ae1fe217734 Author: Alexander Shishkin Date: Tue May 10 16:18:33 2016 +0300 perf/core: Disable the event on a truncated AUX record commit 9f448cd3cbcec8995935e60b27802ae56aac8cc0 upstream. When the PMU driver reports a truncated AUX record, it effectively means that there is no more usable room in the event's AUX buffer (even though there may still be some room, so that perf_aux_output_begin() doesn't take action). At this point the consumer still has to be woken up and the event has to be disabled, otherwise the event will just keep spinning between perf_aux_output_begin() and perf_aux_output_end() until its context gets unscheduled. Again, for cpu-wide events this means never, so once in this condition, they will be forever losing data. Fix this by disabling the event and waking up the consumer in case of a truncated AUX record. Reported-by: Markus Metzger Signed-off-by: Alexander Shishkin Signed-off-by: Peter Zijlstra (Intel) Cc: Arnaldo Carvalho de Melo Cc: Arnaldo Carvalho de Melo Cc: Borislav Petkov Cc: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Stephane Eranian Cc: Thomas Gleixner Cc: Vince Weaver Cc: vince@deater.net Link: http://lkml.kernel.org/r/1462886313-13660-3-git-send-email-alexander.shishkin@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit b1a9793aac18fb5df398e9e2c48e4d5b89dbfca0 Author: Namhyung Kim Date: Tue May 10 11:26:24 2016 -0300 perf diff: Fix duplicated output column commit e9d848cb65d5f6f7731d12bd1b6d994bfdbcc94f upstream. The commit b97511c5bc94 ("perf tools: Add overhead/overhead_children keys defaults via string") moved initialization of column headers but it missed to check the sort__mode. As 'perf diff' doesn't call perf_hpp__init(), the setup_overhead() also should not be called. Before: # Baseline Delta Children Overhead Shared Object Symbol # ........ ....... ........ ........ ................... ....................... # 28.48% -28.47% 28.48% 28.48% [kernel.vmlinux ] [k] intel_idle 11.51% -11.47% 11.51% 11.51% libxul.so [.] 0x0000000001a360f7 3.49% -3.49% 3.49% 3.49% [kernel.vmlinux] [k] generic_exec_single 2.91% -2.89% 2.91% 2.91% libdbus-1.so.3.8.11 [.] 0x000000000000cdc2 2.86% -2.85% 2.86% 2.86% libxcb.so.1.1.0 [.] 0x000000000000c890 2.44% -2.39% 2.44% 2.44% [kernel.vmlinux] [k] perf_event_aux_ctx After: # Baseline Delta Shared Object Symbol # ........ ....... ................... ....................... # 28.48% -28.47% [kernel.vmlinux] [k] intel_idle 11.51% -11.47% libxul.so [.] 0x0000000001a360f7 3.49% -3.49% [kernel.vmlinux] [k] generic_exec_single 2.91% -2.89% libdbus-1.so.3.8.11 [.] 0x000000000000cdc2 2.86% -2.85% libxcb.so.1.1.0 [.] 0x000000000000c890 2.44% -2.39% [kernel.vmlinux] [k] perf_event_aux_ctx Signed-off-by: Namhyung Kim Signed-off-by: Arnaldo Carvalho de Melo Acked-by: Jiri Olsa Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Fixes: b97511c5bc94 ("perf tools: Add overhead/overhead_children keys defaults via string") Link: http://lkml.kernel.org/r/1462890384-12486-2-git-send-email-acme@kernel.org Signed-off-by: Ingo Molnar Signed-off-by: Greg Kroah-Hartman commit d764263450e64f2dddb07afda47c54238fdc491d Author: Jack Pham Date: Thu Apr 14 23:37:26 2016 -0700 regmap: spmi: Fix regmap_spmi_ext_read in multi-byte case commit dec8e8f6e6504aa3496c0f7cc10c756bb0e10f44 upstream. Specifically for the case of reads that use the Extended Register Read Long command, a multi-byte read operation is broken up into 8-byte chunks. However the call to spmi_ext_register_readl() is incorrectly passing 'val_size', which if greater than 8 will always fail. The argument should instead be 'len'. Fixes: c9afbb05a9ff ("regmap: spmi: support base and extended register spaces") Signed-off-by: Jack Pham Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 7ac50df1487deaa39b2b19d7c205fe49d2ef132b Author: Ludovic Desroches Date: Tue Apr 19 16:03:45 2016 +0200 pinctrl: at91-pio4: fix pull-up/down logic commit 5305a7b7e860bb40ab226bc7d58019416073948a upstream. The default configuration of a pin is often with a value in the pull-up/down field at chip reset. So, even if the internal logic of the controller prevents writing a configuration with pull-up and pull-down at the same time, we must ensure explicitly this condition before writing the register. This was leading to a pull-down condition not taken into account for instance. Signed-off-by: Ludovic Desroches Fixes: 776180848b57 ("pinctrl: introduce driver for Atmel PIO4 controller") Acked-by: Alexandre Belloni Acked-by: Nicolas Ferre Signed-off-by: Linus Walleij Signed-off-by: Greg Kroah-Hartman commit ffcd833fdd15a5d8c420dc2dfdae694ef034813e Author: Ben Hutchings Date: Tue Apr 12 12:58:14 2016 +0100 spi: spi-ti-qspi: Handle truncated frames properly commit 1ff7760ff66b98ef244bf0e5e2bd5310651205ad upstream. We clamp frame_len_words to a maximum of 4096, but do not actually limit the number of words written or read through the DATA registers or the length added to spi_message::actual_length. This results in silent data corruption for commands longer than this maximum. Recalculate the length of each transfer, taking frame_len_words into account. Use this length in qspi_{read,write}_msg(), and to increment spi_message::actual_length. Signed-off-by: Ben Hutchings Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 81a9d92dbfa2d56a68dd128e3bafcca6a54b8e7e Author: Ben Hutchings Date: Tue Apr 12 12:56:25 2016 +0100 spi: spi-ti-qspi: Fix FLEN and WLEN settings if bits_per_word is overridden commit ea1b60fb085839a9544cb3a0069992991beabb7f upstream. Each transfer can specify 8, 16 or 32 bits per word independently of the default for the device being addressed. However, currently we calculate the number of words in the frame assuming that the word size is the device default. If multiple transfers in the same message have differing bits_per_word, we bitwise-or the different values in the WLEN register field. Fix both of these. Also rename 'frame_length' to 'frame_len_words' to make clear that it's not a byte count like spi_message::frame_length. Signed-off-by: Ben Hutchings Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit a0703bba986a25dcc0af82dbd189440905dc6aa4 Author: Jarkko Nikula Date: Tue Apr 26 10:08:26 2016 +0300 spi: pxa2xx: Do not detect number of enabled chip selects on Intel SPT commit 66ec246eb9982e7eb8e15e1fc55f543230310dd0 upstream. Certain Intel Sunrisepoint PCH variants report zero chip selects in SPI capabilities register even they have one per port. Detection in pxa2xx_spi_probe() sets master->num_chipselect to 0 leading to -EINVAL from spi_register_master() where chip select count is validated. Fix this by not using SPI capabilities register on Sunrisepoint. They don't have more than one chip select so use the default value 1 instead of detection. Fixes: 8b136baa5892 ("spi: pxa2xx: Detect number of enabled Intel LPSS SPI chip select signals") Signed-off-by: Jarkko Nikula Signed-off-by: Mark Brown Signed-off-by: Greg Kroah-Hartman commit 8d5b1e9a9d19230c4c7c851a2793920cfd1c8ec7 Author: Takashi Iwai Date: Tue May 10 10:24:02 2016 +0200 ALSA: hda - Fix broken reconfig commit addacd801e1638f41d659cb53b9b73fc14322cb1 upstream. The HD-audio reconfig function got broken in the recent kernels, typically resulting in a failure like: snd_hda_intel 0000:00:1b.0: control 3:0:0:Playback Channel Map:0 is already present This is because of the code restructuring to move the PCM and control instantiation into the codec drive probe, by the commit [bcd96557bd0a: ALSA: hda - Build PCMs and controls at codec driver probe]. Although the commit above removed the calls of snd_hda_codec_build_pcms() and *_build_controls() at the controller driver probe, the similar calls in the reconfig were still left forgotten. This caused the conflicting and duplicated PCMs and controls. The fix is trivial: just remove these superfluous calls from reconfig_codec(). Fixes: bcd96557bd0a ('ALSA: hda - Build PCMs and controls at codec driver probe') Reported-by: Jochen Henneberg Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 227dfa13aaca85a3b236f02eb203ed5f00f07189 Author: Kaho Ng Date: Mon May 9 00:27:49 2016 +0800 ALSA: hda - Fix white noise on Asus UX501VW headset commit 2da2dc9ead232f25601404335cca13c0f722d41b upstream. For reducing the noise from the headset output on ASUS UX501VW, call the existing fixup, alc_fixup_headset_mode_alc668(), additionally. Thread: https://bbs.archlinux.org/viewtopic.php?id=209554 Signed-off-by: Kaho Ng Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit f36adc590eb3e9a4fd6c61a1cb9f6b273dd58138 Author: Yura Pakhuchiy Date: Sat May 7 23:53:36 2016 +0700 ALSA: hda - Fix subwoofer pin on ASUS N751 and N551 commit 3231e2053eaeee70bdfb216a78a30f11e88e2243 upstream. Subwoofer does not work out of the box on ASUS N751/N551 laptops. This patch fixes it. Patch tested on N751 laptop. N551 part is not tested, but according to [1] and [2] this laptop requires similar changes, so I included them in the patch. 1. https://github.com/honsiorovskyi/asus-n551-hda-fix 2. https://bugs.launchpad.net/ubuntu/+source/alsa-tools/+bug/1405691 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117781 Signed-off-by: Yura Pakhuchiy Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit b2cd322d3db7c2181ebe310a8e8f6e5eb3071293 Author: Takashi Iwai Date: Wed May 11 17:48:00 2016 +0200 ALSA: usb-audio: Yet another Phoneix Audio device quirk commit 84add303ef950b8d85f54bc2248c2bc73467c329 upstream. Phoenix Audio has yet another device with another id (even a different vendor id, 0556:0014) that requires the same quirk for the sample rate. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=110221 Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 1024e85d016617e8f69515d190af127a46be10f1 Author: Takashi Iwai Date: Fri Apr 29 11:20:15 2016 +0200 ALSA: usb-audio: Quirk for yet another Phoenix Audio devices (v2) commit 2d2c038a9999f423e820d89db2b5d7774b67ba49 upstream. Phoenix Audio MT202pcs (1de7:0114) and MT202exe (1de7:0013) need the same workaround as TMX320 for avoiding the firmware bug. It fixes the frequent error about the sample rate inquiries and the slow device probe as consequence. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=117321 Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman commit 04706e53238a1b4b5bc07cac93265ca2a0992cb8 Author: Herbert Xu Date: Thu May 5 16:42:49 2016 +0800 crypto: testmgr - Use kmalloc memory for RSA input commit df27b26f04ed388ff4cc2b5d8cfdb5d97678816f upstream. As akcipher uses an SG interface, you must not use vmalloc memory as input for it. This patch fixes testmgr to copy the vmalloc test vectors to kmalloc memory before running the test. This patch also removes a superfluous sg_virt call in do_test_rsa. Reported-by: Anatoly Pugachev Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit afd96c1fffcd9040b446213a604d43b24734d0ad Author: Herbert Xu Date: Wed May 4 17:52:56 2016 +0800 crypto: hash - Fix page length clamping in hash walk commit 13f4bb78cf6a312bbdec367ba3da044b09bf0e29 upstream. The crypto hash walk code is broken when supplied with an offset greater than or equal to PAGE_SIZE. This patch fixes it by adjusting walk->pg and walk->offset when this happens. Reported-by: Steffen Klassert Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 0ea84f612750521608c690615813f846b13706d6 Author: Tadeusz Struk Date: Fri Apr 29 10:43:40 2016 -0700 crypto: qat - fix adf_ctl_drv.c:undefined reference to adf_init_pf_wq commit 6dc5df71ee5c8b44607928bfe27be50314dcf848 upstream. Fix undefined reference issue reported by kbuild test robot. Reported-by: kbuild test robot Signed-off-by: Tadeusz Struk Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit 2b30856211abd0a02bc2fc336a7c06ba279d107b Author: Tadeusz Struk Date: Mon Apr 25 07:32:19 2016 -0700 crypto: qat - fix invalid pf2vf_resp_wq logic commit 9e209fcfb804da262e38e5cd2e680c47a41f0f95 upstream. The pf2vf_resp_wq is a global so it has to be created at init and destroyed at exit, instead of per device. Tested-by: Suresh Marikkannu Signed-off-by: Tadeusz Struk Signed-off-by: Herbert Xu Signed-off-by: Greg Kroah-Hartman commit f1b9cba08781033c7641e54d1589b3d43d51d443 Author: Andrea Arcangeli Date: Thu May 12 15:42:25 2016 -0700 mm: thp: calculate the mapcount correctly for THP pages during WP faults commit 6d0a07edd17cfc12fdc1f36de8072fa17cc3666f upstream. This will provide fully accuracy to the mapcount calculation in the write protect faults, so page pinning will not get broken by false positive copy-on-writes. total_mapcount() isn't the right calculation needed in reuse_swap_page(), so this introduces a page_trans_huge_mapcount() that is effectively the full accurate return value for page_mapcount() if dealing with Transparent Hugepages, however we only use the page_trans_huge_mapcount() during COW faults where it strictly needed, due to its higher runtime cost. This also provide at practical zero cost the total_mapcount information which is needed to know if we can still relocate the page anon_vma to the local vma. If page_trans_huge_mapcount() returns 1 we can reuse the page no matter if it's a pte or a pmd_trans_huge triggering the fault, but we can only relocate the page anon_vma to the local vma->anon_vma if we're sure it's only this "vma" mapping the whole THP physical range. Kirill A. Shutemov discovered the problem with moving the page anon_vma to the local vma->anon_vma in a previous version of this patch and another problem in the way page_move_anon_rmap() was called. Andrew Morton discovered that CONFIG_SWAP=n wouldn't build in a previous version, because reuse_swap_page must be a macro to call page_trans_huge_mapcount from swap.h, so this uses a macro again instead of an inline function. With this change at least it's a less dangerous usage than it was before, because "page" is used only once now, while with the previous code reuse_swap_page(page++) would have called page_mapcount on page+1 and it would have increased page twice instead of just once. Dean Luick noticed an uninitialized variable that could result in a rmap inefficiency for the non-THP case in a previous version. Mike Marciniszyn said: : Our RDMA tests are seeing an issue with memory locking that bisects to : commit 61f5d698cc97 ("mm: re-enable THP") : : The test program registers two rather large MRs (512M) and RDMA : writes data to a passive peer using the first and RDMA reads it back : into the second MR and compares that data. The sizes are chosen randomly : between 0 and 1024 bytes. : : The test will get through a few (<= 4 iterations) and then gets a : compare error. : : Tracing indicates the kernel logical addresses associated with the individual : pages at registration ARE correct , the data in the "RDMA read response only" : packets ARE correct. : : The "corruption" occurs when the packet crosse two pages that are not physically : contiguous. The second page reads back as zero in the program. : : It looks like the user VA at the point of the compare error no longer points to : the same physical address as was registered. : : This patch totally resolves the issue! Link: http://lkml.kernel.org/r/1462547040-1737-2-git-send-email-aarcange@redhat.com Signed-off-by: Andrea Arcangeli Reviewed-by: "Kirill A. Shutemov" Reviewed-by: Dean Luick Tested-by: Alex Williamson Tested-by: Mike Marciniszyn Tested-by: Josh Collier Cc: Marc Haber Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 19654c855d606ffb3e9e6c9d8d0a093e1cd4adae Author: Sergey Senozhatsky Date: Mon May 9 16:28:49 2016 -0700 zsmalloc: fix zs_can_compact() integer overflow commit 44f43e99fe70833058482d183e99fdfd11220996 upstream. zs_can_compact() has two race conditions in its core calculation: unsigned long obj_wasted = zs_stat_get(class, OBJ_ALLOCATED) - zs_stat_get(class, OBJ_USED); 1) classes are not locked, so the numbers of allocated and used objects can change by the concurrent ops happening on other CPUs 2) shrinker invokes it from preemptible context Depending on the circumstances, thus, OBJ_ALLOCATED can become less than OBJ_USED, which can result in either very high or negative `total_scan' value calculated later in do_shrink_slab(). do_shrink_slab() has some logic to prevent those cases: vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-64 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 vmscan: shrink_slab: zs_shrinker_scan+0x0/0x28 [zsmalloc] negative objects to delete nr=-62 However, due to the way `total_scan' is calculated, not every shrinker->count_objects() overflow can be spotted and handled. To demonstrate the latter, I added some debugging code to do_shrink_slab() (x86_64) and the results were: vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615] vmscan: but total_scan > 0: 92679974445502 vmscan: resulting total_scan: 92679974445502 [..] vmscan: OVERFLOW: shrinker->count_objects() == -1 [18446744073709551615] vmscan: but total_scan > 0: 22634041808232578 vmscan: resulting total_scan: 22634041808232578 Even though shrinker->count_objects() has returned an overflowed value, the resulting `total_scan' is positive, and, what is more worrisome, it is insanely huge. This value is getting used later on in shrinker->scan_objects() loop: while (total_scan >= batch_size || total_scan >= freeable) { unsigned long ret; unsigned long nr_to_scan = min(batch_size, total_scan); shrinkctl->nr_to_scan = nr_to_scan; ret = shrinker->scan_objects(shrinker, shrinkctl); if (ret == SHRINK_STOP) break; freed += ret; count_vm_events(SLABS_SCANNED, nr_to_scan); total_scan -= nr_to_scan; cond_resched(); } `total_scan >= batch_size' is true for a very-very long time and 'total_scan >= freeable' is also true for quite some time, because `freeable < 0' and `total_scan' is large enough, for example, 22634041808232578. The only break condition, in the given scheme of things, is shrinker->scan_objects() == SHRINK_STOP test, which is a bit too weak to rely on, especially in heavy zsmalloc-usage scenarios. To fix the issue, take a pool stat snapshot and use it instead of racy zs_stat_get() calls. Link: http://lkml.kernel.org/r/20160509140052.3389-1-sergey.senozhatsky@gmail.com Signed-off-by: Sergey Senozhatsky Cc: Minchan Kim Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 23d3f15feaaba0cc20ac549f3aae796c95120326 Author: Junxiao Bi Date: Thu May 12 15:42:18 2016 -0700 ocfs2: fix posix_acl_create deadlock commit c25a1e0671fbca7b2c0d0757d533bd2650d6dc0c upstream. Commit 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure") refactored code to use posix_acl_create. The problem with this function is that it is not mindful of the cluster wide inode lock making it unsuitable for use with ocfs2 inode creation with ACLs. For example, when used in ocfs2_mknod, this function can cause deadlock as follows. The parent dir inode lock is taken when calling posix_acl_create -> get_acl -> ocfs2_iop_get_acl which takes the inode lock again. This can cause deadlock if there is a blocked remote lock request waiting for the lock to be downconverted. And same deadlock happened in ocfs2_reflink. This fix is to revert back using ocfs2_init_acl. Fixes: 702e5bc68ad2 ("ocfs2: use generic posix ACL infrastructure") Signed-off-by: Tariq Saeed Signed-off-by: Junxiao Bi Cc: Mark Fasheh Cc: Joel Becker Cc: Joseph Qi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 0c4998e34bdd567bca20f857015c595839457a2e Author: Junxiao Bi Date: Thu May 12 15:42:15 2016 -0700 ocfs2: revert using ocfs2_acl_chmod to avoid inode cluster lock hang commit 5ee0fbd50fdf1c1329de8bee35ea9d7c6a81a2e0 upstream. Commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") introduced this issue. ocfs2_setattr called by chmod command holds cluster wide inode lock when calling posix_acl_chmod. This latter function in turn calls ocfs2_iop_get_acl and ocfs2_iop_set_acl. These two are also called directly from vfs layer for getfacl/setfacl commands and therefore acquire the cluster wide inode lock. If a remote conversion request comes after the first inode lock in ocfs2_setattr, OCFS2_LOCK_BLOCKED will be set. And this will cause the second call to inode lock from the ocfs2_iop_get_acl() to block indefinetly. The deleted version of ocfs2_acl_chmod() calls __posix_acl_chmod() which does not call back into the filesystem. Therefore, we restore ocfs2_acl_chmod(), modify it slightly for locking as needed, and use that instead. Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Tariq Saeed Signed-off-by: Junxiao Bi Cc: Mark Fasheh Cc: Joel Becker Cc: Joseph Qi Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds Signed-off-by: Greg Kroah-Hartman commit 8be13acc8709f013813da8ae36c6f0e538cbcc0f Author: Paolo Abeni Date: Fri May 13 18:33:41 2016 +0200 net/route: enforce hoplimit max value [ Upstream commit 626abd59e51d4d8c6367e03aae252a8aa759ac78 ] Currently, when creating or updating a route, no check is performed in both ipv4 and ipv6 code to the hoplimit value. The caller can i.e. set hoplimit to 256, and when such route will be used, packets will be sent with hoplimit/ttl equal to 0. This commit adds checks for the RTAX_HOPLIMIT value, in both ipv4 ipv6 route code, substituting any value greater than 255 with 255. This is consistent with what is currently done for ADVMSS and MTU in the ipv4 code. Signed-off-by: Paolo Abeni Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 057f85ef44965346eb3bd5e8978dd7297b386ee7 Author: Eric Dumazet Date: Mon May 9 20:55:16 2016 -0700 tcp: refresh skb timestamp at retransmit time [ Upstream commit 10a81980fc47e64ffac26a073139813d3f697b64 ] In the very unlikely case __tcp_retransmit_skb() can not use the cloning done in tcp_transmit_skb(), we need to refresh skb_mstamp before doing the copy and transmit, otherwise TCP TS val will be an exact copy of original transmit. Fixes: 7faee5c0d514 ("tcp: remove TCP_SKB_CB(skb)->when") Signed-off-by: Eric Dumazet Cc: Yuchung Cheng Acked-by: Yuchung Cheng Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 56633a532c9b3aa41ef8aca951d37ebdd5997a1a Author: xypron.glpk@gmx.de Date: Mon May 9 00:46:18 2016 +0200 net: thunderx: avoid exposing kernel stack [ Upstream commit 161de2caf68c549c266e571ffba8e2163886fb10 ] Reserved fields should be set to zero to avoid exposing bits from the kernel stack. Signed-off-by: Heinrich Schuchardt Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit f7ee286fab0b55bf5908978c94e50d52e627b3ac Author: Kangjie Lu Date: Sun May 8 12:10:14 2016 -0400 net: fix a kernel infoleak in x25 module [ Upstream commit 79e48650320e6fba48369fccf13fd045315b19b8 ] Stack object "dte_facilities" is allocated in x25_rx_call_request(), which is supposed to be initialized in x25_negotiate_facilities. However, 5 fields (8 bytes in total) are not initialized. This object is then copied to userland via copy_to_user, thus infoleak occurs. Signed-off-by: Kangjie Lu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 05b902a09b329bbe0b330332efe4c46ee0dbc83f Author: Mikko Rapeli Date: Sun Apr 24 17:45:00 2016 +0200 uapi glibc compat: fix compile errors when glibc net/if.h included before linux/if.h [ Upstream commit 4a91cb61bb995e5571098188092e296192309c77 ] glibc's net/if.h contains copies of definitions from linux/if.h and these conflict and cause build failures if both files are included by application source code. Changes in uapi headers, which fixed header file dependencies to include linux/if.h when it was needed, e.g. commit 1ffad83d, made the net/if.h and linux/if.h incompatibilities visible as build failures for userspace applications like iproute2 and xtables-addons. This patch fixes compile errors when glibc net/if.h is included before linux/if.h: ./linux/if.h:99:21: error: redeclaration of enumerator ‘IFF_NOARP’ ./linux/if.h:98:23: error: redeclaration of enumerator ‘IFF_RUNNING’ ./linux/if.h:97:26: error: redeclaration of enumerator ‘IFF_NOTRAILERS’ ./linux/if.h:96:27: error: redeclaration of enumerator ‘IFF_POINTOPOINT’ ./linux/if.h:95:24: error: redeclaration of enumerator ‘IFF_LOOPBACK’ ./linux/if.h:94:21: error: redeclaration of enumerator ‘IFF_DEBUG’ ./linux/if.h:93:25: error: redeclaration of enumerator ‘IFF_BROADCAST’ ./linux/if.h:92:19: error: redeclaration of enumerator ‘IFF_UP’ ./linux/if.h:252:8: error: redefinition of ‘struct ifconf’ ./linux/if.h:203:8: error: redefinition of ‘struct ifreq’ ./linux/if.h:169:8: error: redefinition of ‘struct ifmap’ ./linux/if.h:107:23: error: redeclaration of enumerator ‘IFF_DYNAMIC’ ./linux/if.h:106:25: error: redeclaration of enumerator ‘IFF_AUTOMEDIA’ ./linux/if.h:105:23: error: redeclaration of enumerator ‘IFF_PORTSEL’ ./linux/if.h:104:25: error: redeclaration of enumerator ‘IFF_MULTICAST’ ./linux/if.h:103:21: error: redeclaration of enumerator ‘IFF_SLAVE’ ./linux/if.h:102:22: error: redeclaration of enumerator ‘IFF_MASTER’ ./linux/if.h:101:24: error: redeclaration of enumerator ‘IFF_ALLMULTI’ ./linux/if.h:100:23: error: redeclaration of enumerator ‘IFF_PROMISC’ The cases where linux/if.h is included before net/if.h need a similar fix in the glibc side, or the order of include files can be changed userspace code as a workaround. This change was tested in x86 userspace on Debian unstable with scripts/headers_compile_test.sh: $ make headers_install && \ cd usr/include && ../../scripts/headers_compile_test.sh -l -k ... cc -Wall -c -nostdinc -I /usr/lib/gcc/i586-linux-gnu/5/include -I /usr/lib/gcc/i586-linux-gnu/5/include-fixed -I . -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH -I /home/mcfrisk/src/linux-2.6/usr/headers_compile_test_include.2uX2zH/i586-linux-gnu -o /dev/null ./linux/if.h_libc_before_kernel.h PASSED libc before kernel test: ./linux/if.h Reported-by: Jan Engelhardt Reported-by: Josh Boyer Reported-by: Stephen Hemminger Reported-by: Waldemar Brodkorb Cc: Gabriel Laskar Signed-off-by: Mikko Rapeli Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 14ece2e866a334f51e59a3d0344466e497e40086 Author: Linus Lüssing Date: Wed May 4 17:25:02 2016 +0200 bridge: fix igmp / mld query parsing [ Upstream commit 856ce5d083e14571d051301fe3c65b32b8cbe321 ] With the newly introduced helper functions the skb pulling is hidden in the checksumming function - and undone before returning to the caller. The IGMP and MLD query parsing functions in the bridge still assumed that the skb is pointing to the beginning of the IGMP/MLD message while it is now kept at the beginning of the IPv4/6 header. If there is a querier somewhere else, then this either causes the multicast snooping to stay disabled even though it could be enabled. Or, if we have the querier enabled too, then this can create unnecessary IGMP / MLD query messages on the link. Fixing this by taking the offset between IP and IGMP/MLD header into account, too. Fixes: 9afd85c9e455 ("net: Export IGMP/MLD message validation code") Reported-by: Simon Wunderlich Signed-off-by: Linus Lüssing Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 6542eb9ef70183c6a299782f52728ca99ee1d9ab Author: Nikolay Aleksandrov Date: Wed May 4 16:18:45 2016 +0200 net: bridge: fix old ioctl unlocked net device walk [ Upstream commit 31ca0458a61a502adb7ed192bf9716c6d05791a5 ] get_bridge_ifindices() is used from the old "deviceless" bridge ioctl calls which aren't called with rtnl held. The comment above says that it is called with rtnl but that is not really the case. Here's a sample output from a test ASSERT_RTNL() which I put in get_bridge_ifindices and executed "brctl show": [ 957.422726] RTNL: assertion failed at net/bridge//br_ioctl.c (30) [ 957.422925] CPU: 0 PID: 1862 Comm: brctl Tainted: G W O 4.6.0-rc4+ #157 [ 957.423009] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014 [ 957.423009] 0000000000000000 ffff880058adfdf0 ffffffff8138dec5 0000000000000400 [ 957.423009] ffffffff81ce8380 ffff880058adfe58 ffffffffa05ead32 0000000000000001 [ 957.423009] 00007ffec1a444b0 0000000000000400 ffff880053c19130 0000000000008940 [ 957.423009] Call Trace: [ 957.423009] [] dump_stack+0x85/0xc0 [ 957.423009] [] br_ioctl_deviceless_stub+0x212/0x2e0 [bridge] [ 957.423009] [] sock_ioctl+0x22b/0x290 [ 957.423009] [] do_vfs_ioctl+0x95/0x700 [ 957.423009] [] SyS_ioctl+0x79/0x90 [ 957.423009] [] entry_SYSCALL_64_fastpath+0x23/0xc1 Since it only reads bridge ifindices, we can use rcu to safely walk the net device list. Also remove the wrong rtnl comment above. Signed-off-by: Nikolay Aleksandrov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit da6b6f3e3ba5980f00f1e24362802a4fe618a279 Author: Ian Campbell Date: Wed May 4 14:21:53 2016 +0100 VSOCK: do not disconnect socket when peer has shutdown SEND only [ Upstream commit dedc58e067d8c379a15a8a183c5db318201295bb ] The peer may be expecting a reply having sent a request and then done a shutdown(SHUT_WR), so tearing down the whole socket at this point seems wrong and breaks for me with a client which does a SHUT_WR. Looking at other socket family's stream_recvmsg callbacks doing a shutdown here does not seem to be the norm and removing it does not seem to have had any adverse effects that I can see. I'm using Stefan's RFC virtio transport patches, I'm unsure of the impact on the vmci transport. Signed-off-by: Ian Campbell Cc: "David S. Miller" Cc: Stefan Hajnoczi Cc: Claudio Imbrenda Cc: Andy King Cc: Dmitry Torokhov Cc: Jorgen Hansen Cc: Adit Ranadive Cc: netdev@vger.kernel.org Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit eaf9dc35e458ce057d15137d5faf6147b1ac4504 Author: Daniel Jurgens Date: Wed May 4 15:00:33 2016 +0300 net/mlx4_en: Fix endianness bug in IPV6 csum calculation [ Upstream commit 82d69203df634b4dfa765c94f60ce9482bcc44d6 ] Use htons instead of unconditionally byte swapping nexthdr. On a little endian systems shifting the byte is correct behavior, but it results in incorrect csums on big endian architectures. Fixes: f8c6455bb04b ('net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE') Signed-off-by: Daniel Jurgens Reviewed-by: Carol Soto Tested-by: Carol Soto Signed-off-by: Tariq Toukan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ff82293b226fd3bbfbd6d3fcbb0ffbbd55c85862 Author: Kangjie Lu Date: Tue May 3 16:46:24 2016 -0400 net: fix infoleak in rtnetlink [ Upstream commit 5f8e44741f9f216e33736ea4ec65ca9ac03036e6 ] The stack object “map” has a total size of 32 bytes. Its last 4 bytes are padding generated by compiler. These padding bytes are not initialized and sent out via “nla_put”. Signed-off-by: Kangjie Lu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 33d437ee77122c4889d1e9c7ff6488f04b9cf05e Author: Kangjie Lu Date: Tue May 3 16:35:05 2016 -0400 net: fix infoleak in llc [ Upstream commit b8670c09f37bdf2847cc44f36511a53afc6161fd ] The stack object “info” has a total size of 12 bytes. Its last byte is padding which is not initialized and leaked via “put_cmsg”. Signed-off-by: Kangjie Lu Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 926b7f45d2763578e2a7401f5ee90719be74059c Author: Uwe Kleine-König Date: Tue May 3 16:38:53 2016 +0200 net: fec: only clear a queue's work bit if the queue was emptied [ Upstream commit 1c021bb717a70aaeaa4b25c91f43c2aeddd922de ] In the receive path a queue's work bit was cleared unconditionally even if fec_enet_rx_queue only read out a part of the available packets from the hardware. This resulted in not reading any packets in the next napi turn and so packets were delayed or lost. The obvious fix is to only clear a queue's bit when the queue was emptied. Fixes: 4d494cdc92b3 ("net: fec: change data structure to support multiqueue") Signed-off-by: Uwe Kleine-König Reviewed-by: Lucas Stach Tested-by: Fugang Duan Acked-by: Fugang Duan Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 54aa3889c020e7857babc6c72e615d2bc32fc6fa Author: Nicolas Dichtel Date: Tue May 3 09:58:27 2016 +0200 ipv6/ila: fix nlsize calculation for lwtunnel [ Upstream commit 79e8dc8b80bff0bc5bbb90ca5e73044bf207c8ac ] The handler 'ila_fill_encap_info' adds one attribute: ILA_ATTR_LOCATOR. Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module") CC: Tom Herbert Signed-off-by: Nicolas Dichtel Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 97a77feecd4669e060cdc8d6f815c9929507d5c9 Author: Neil Horman Date: Mon May 2 12:20:15 2016 -0400 netem: Segment GSO packets on enqueue [ Upstream commit 6071bd1aa13ed9e41824bafad845b7b7f4df5cfd ] This was recently reported to me, and reproduced on the latest net kernel, when attempting to run netperf from a host that had a netem qdisc attached to the egress interface: [ 788.073771] ---------------------[ cut here ]--------------------------- [ 788.096716] WARNING: at net/core/dev.c:2253 skb_warn_bad_offload+0xcd/0xda() [ 788.129521] bnx2: caps=(0x00000001801949b3, 0x0000000000000000) len=2962 data_len=0 gso_size=1448 gso_type=1 ip_summed=3 [ 788.182150] Modules linked in: sch_netem kvm_amd kvm crc32_pclmul ipmi_ssif ghash_clmulni_intel sp5100_tco amd64_edac_mod aesni_intel lrw gf128mul glue_helper ablk_helper edac_mce_amd cryptd pcspkr sg edac_core hpilo ipmi_si i2c_piix4 k10temp fam15h_power hpwdt ipmi_msghandler shpchp acpi_power_meter pcc_cpufreq nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 syscopyarea sysfillrect sysimgblt i2c_algo_bit drm_kms_helper ahci ata_generic pata_acpi ttm libahci crct10dif_pclmul pata_atiixp tg3 libata crct10dif_common drm crc32c_intel ptp serio_raw bnx2 r8169 hpsa pps_core i2c_core mii dm_mirror dm_region_hash dm_log dm_mod [ 788.465294] CPU: 16 PID: 0 Comm: swapper/16 Tainted: G W ------------ 3.10.0-327.el7.x86_64 #1 [ 788.511521] Hardware name: HP ProLiant DL385p Gen8, BIOS A28 12/17/2012 [ 788.542260] ffff880437c036b8 f7afc56532a53db9 ffff880437c03670 ffffffff816351f1 [ 788.576332] ffff880437c036a8 ffffffff8107b200 ffff880633e74200 ffff880231674000 [ 788.611943] 0000000000000001 0000000000000003 0000000000000000 ffff880437c03710 [ 788.647241] Call Trace: [ 788.658817] [] dump_stack+0x19/0x1b [ 788.686193] [] warn_slowpath_common+0x70/0xb0 [ 788.713803] [] warn_slowpath_fmt+0x5c/0x80 [ 788.741314] [] ? ___ratelimit+0x93/0x100 [ 788.767018] [] skb_warn_bad_offload+0xcd/0xda [ 788.796117] [] skb_checksum_help+0x17c/0x190 [ 788.823392] [] netem_enqueue+0x741/0x7c0 [sch_netem] [ 788.854487] [] dev_queue_xmit+0x2a8/0x570 [ 788.880870] [] ip_finish_output+0x53d/0x7d0 ... The problem occurs because netem is not prepared to handle GSO packets (as it uses skb_checksum_help in its enqueue path, which cannot manipulate these frames). The solution I think is to simply segment the skb in a simmilar fashion to the way we do in __dev_queue_xmit (via validate_xmit_skb), with some minor changes. When we decide to corrupt an skb, if the frame is GSO, we segment it, corrupt the first segment, and enqueue the remaining ones. tested successfully by myself on the latest net kernel, to which this applies Signed-off-by: Neil Horman CC: Jamal Hadi Salim CC: "David S. Miller" CC: netem@lists.linux-foundation.org CC: eric.dumazet@gmail.com CC: stephen@networkplumber.org Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 3d020622f6c4445bb4fd74de4b40a69ae7dddf69 Author: WANG Cong Date: Thu Feb 25 14:55:03 2016 -0800 sch_dsmark: update backlog as well [ Upstream commit bdf17661f63a79c3cb4209b970b1cc39e34f7543 ] Similarly, we need to update backlog too when we update qlen. Cc: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 981f02b55d11910cf65d9e4963d0993aeb7ba1ec Author: WANG Cong Date: Thu Feb 25 14:55:02 2016 -0800 sch_htb: update backlog as well [ Upstream commit 431e3a8e36a05a37126f34b41aa3a5a6456af04e ] We saw qlen!=0 but backlog==0 on our production machine: qdisc htb 1: dev eth0 root refcnt 2 r2q 10 default 1 direct_packets_stat 0 ver 3.17 Sent 172680457356 bytes 222469449 pkt (dropped 0, overlimits 123575834 requeues 0) backlog 0b 72p requeues 0 The problem is we only count qlen for HTB qdisc but not backlog. We need to update backlog too when we update qlen, so that we can at least know the average packet length. Cc: Jamal Hadi Salim Acked-by: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 01465faa0e2d311512690724196042f9bb466034 Author: WANG Cong Date: Thu Feb 25 14:55:01 2016 -0800 net_sched: update hierarchical backlog too [ Upstream commit 2ccccf5fb43ff62b2b96cc58d95fc0b3596516e4 ] When the bottom qdisc decides to, for example, drop some packet, it calls qdisc_tree_decrease_qlen() to update the queue length for all its ancestors, we need to update the backlog too to keep the stats on root qdisc accurate. Cc: Jamal Hadi Salim Acked-by: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2da507a87181a33b490ca4daba2c994cbaf7055f Author: WANG Cong Date: Thu Feb 25 14:55:00 2016 -0800 net_sched: introduce qdisc_replace() helper [ Upstream commit 86a7996cc8a078793670d82ed97d5a99bb4e8496 ] Remove nearly duplicated code and prepare for the following patch. Cc: Jamal Hadi Salim Acked-by: Jamal Hadi Salim Signed-off-by: Cong Wang Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 862ad6c9c9b4496b91384624fee6af74c6420d6d Author: Jiri Benc Date: Fri Apr 29 23:31:32 2016 +0200 gre: do not pull header in ICMP error processing [ Upstream commit b7f8fe251e4609e2a437bd2c2dea01e61db6849c ] iptunnel_pull_header expects that IP header was already pulled; with this expectation, it pulls the tunnel header. This is not true in gre_err. Furthermore, ipv4_update_pmtu and ipv4_redirect expect that skb->data points to the IP header. We cannot pull the tunnel header in this path. It's just a matter of not calling iptunnel_pull_header - we don't need any of its effects. Fixes: bda7bb463436 ("gre: Allow multiple protocol listener for gre protocol.") Signed-off-by: Jiri Benc Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a2ea9029cafdd2e03d2e642b2d4867098088fb80 Author: Tim Bingham Date: Fri Apr 29 13:30:23 2016 -0400 net: Implement net_dbg_ratelimited() for CONFIG_DYNAMIC_DEBUG case [ Upstream commit 2c94b53738549d81dc7464a32117d1f5112c64d3 ] Prior to commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when !DEBUG") the implementation of net_dbg_ratelimited() was buggy for both the DEBUG and CONFIG_DYNAMIC_DEBUG cases. The bug was that net_ratelimit() was being called and, despite returning true, nothing was being printed to the console. This resulted in messages like the following - "net_ratelimit: %d callbacks suppressed" with no other output nearby. After commit d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when !DEBUG") the bug is fixed for the DEBUG case. However, there's no output at all for CONFIG_DYNAMIC_DEBUG case. This patch restores debug output (if enabled) for the CONFIG_DYNAMIC_DEBUG case. Add a definition of net_dbg_ratelimited() for the CONFIG_DYNAMIC_DEBUG case. The implementation takes care to check that dynamic debugging is enabled before calling net_ratelimit(). Fixes: d92cff89a0c8 ("net_dbg_ratelimited: turn into no-op when !DEBUG") Signed-off-by: Tim Bingham Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 59d0a01cc7af0c8556c64812a3429dc1e96cd006 Author: Alexei Starovoitov Date: Wed Apr 27 18:56:22 2016 -0700 samples/bpf: fix trace_output example [ Upstream commit 569cc39d39385a74b23145496bca2df5ac8b2fb8 ] llvm cannot always recognize memset as builtin function and optimize it away, so just delete it. It was a leftover from testing of bpf_perf_event_output() with large data structures. Fixes: 39111695b1b8 ("samples: bpf: add bpf_perf_event_output example") Signed-off-by: Alexei Starovoitov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0ae7bad57030e0f1cb22d34ceecd4a3d1bd3b4b0 Author: Alexei Starovoitov Date: Wed Apr 27 18:56:21 2016 -0700 bpf: fix check_map_func_compatibility logic [ Upstream commit 6aff67c85c9e5a4bc99e5211c1bac547936626ca ] The commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter") introduced clever way to check bpf_helper<->map_type compatibility. Later on commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adjusted the logic and inadvertently broke it. Get rid of the clever bool compare and go back to two-way check from map and from helper perspective. Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") Reported-by: Jann Horn Signed-off-by: Alexei Starovoitov Signed-off-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 1b106ad23a72bba34c7f37574defa324fdd76fc7 Author: Alexei Starovoitov Date: Wed Apr 27 18:56:20 2016 -0700 bpf: fix refcnt overflow [ Upstream commit 92117d8443bc5afacc8d5ba82e541946310f106e ] On a system with >32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK, the malicious application may overflow 32-bit bpf program refcnt. It's also possible to overflow map refcnt on 1Tb system. Impose 32k hard limit which means that the same bpf program or map cannot be shared by more than 32k processes. Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs") Reported-by: Jann Horn Signed-off-by: Alexei Starovoitov Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2ffd01aa8d12c83c43b611a74a09852ea4dd0111 Author: Jann Horn Date: Tue Apr 26 22:26:26 2016 +0200 bpf: fix double-fdput in replace_map_fd_with_map_ptr() [ Upstream commit 8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7 ] When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode references a non-map file descriptor as a map file descriptor, the error handling code called fdput() twice instead of once (in __bpf_map_get() and in replace_map_fd_with_map_ptr()). If the file descriptor table of the current task is shared, this causes f_count to be decremented too much, allowing the struct file to be freed while it is still in use (use-after-free). This can be exploited to gain root privileges by an unprivileged user. This bug was introduced in commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only exploitable since commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because previously, CAP_SYS_ADMIN was required to reach the vulnerable code. (posted publicly according to request by maintainer) Signed-off-by: Jann Horn Signed-off-by: Linus Torvalds Acked-by: Alexei Starovoitov Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7d57af4f11f18b6aa48e5f083ada67fac917612e Author: Eric Dumazet Date: Sat Apr 23 11:35:46 2016 -0700 net/mlx4_en: fix spurious timestamping callbacks [ Upstream commit fc96256c906362e845d848d0f6a6354450059e81 ] When multiple skb are TX-completed in a row, we might incorrectly keep a timestamp of a prior skb and cause extra work. Fixes: ec693d47010e8 ("net/mlx4_en: Add HW timestamping (TS) support") Signed-off-by: Eric Dumazet Cc: Willem de Bruijn Reviewed-by: Eran Ben Elisha Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 698faa8f415febe9fa6231ab55fb4aa5a78d9b28 Author: Paolo Abeni Date: Thu Apr 21 22:23:31 2016 +0200 ipv4/fib: don't warn when primary address is missing if in_dev is dead [ Upstream commit 391a20333b8393ef2e13014e6e59d192c5594471 ] After commit fbd40ea0180a ("ipv4: Don't do expensive useless work during inetdev destroy.") when deleting an interface, fib_del_ifaddr() can be executed without any primary address present on the dead interface. The above is safe, but triggers some "bug: prim == NULL" warnings. This commit avoids warning if the in_dev is dead Signed-off-by: Paolo Abeni Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 186654f41a99acbedef6f77c673f1a28636b29e4 Author: Saeed Mahameed Date: Fri Apr 22 00:33:05 2016 +0300 net/mlx5e: Use vport MTU rather than physical port MTU [ Upstream commit cd255efff9baadd654d6160e52d17ae7c568c9d3 ] Set and report vport MTU rather than physical MTU, Driver will set both vport and physical port mtu and will rely on the query of vport mtu. SRIOV VFs have to report their MTU to their vport manager (PF), and this will allow them to work with any MTU they need without failing the request. Also for some cases where the PF is not a port owner, PF can work with MTU less than the physical port mtu if set physical port mtu didn't take effect. Signed-off-by: Saeed Mahameed Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 99d966897d7b50279297aeeb25d6f32b84109879 Author: Saeed Mahameed Date: Fri Apr 22 00:33:04 2016 +0300 net/mlx5e: Fix minimum MTU [ Upstream commit d8edd2469ace550db707798180d1c84d81f93bca ] Minimum MTU that can be set in Connectx4 device is 68. This fixes the case where a user wants to set invalid MTU, the driver will fail to satisfy this request and the interface will stay down. It is better to report an error and continue working with old mtu. Signed-off-by: Saeed Mahameed Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 88e2d2d2eb82a0f3c96a19868dbe30dac219cd47 Author: Saeed Mahameed Date: Fri Apr 22 00:33:03 2016 +0300 net/mlx5e: Device's mtu field is u16 and not int [ Upstream commit 046339eaab26804f52f6604877f5674f70815b26 ] For set/query MTU port firmware commands the MTU field is 16 bits, here I changed all the "int mtu" parameters of the functions wrapping those firmware commands to be u16. Signed-off-by: Saeed Mahameed Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2eeed9d5da7c144c9e1dda81094b9cb711333cb2 Author: Maor Gottlieb Date: Fri Apr 22 00:33:00 2016 +0300 net/mlx5_core: Fix soft lockup in steering error flow [ Upstream commit c3f9bf628bc7edda298897d952f5e761137229c9 ] In the error flow of adding flow rule to auto-grouped flow table, we call to tree_remove_node. tree_remove_node locks the node's parent, however the node's parent is already locked by mlx5_add_flow_rule and this causes a deadlock. After this patch, if we failed to add the flow rule, we unlock the flow table before calling to tree_remove_node. fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped flow table') Signed-off-by: Maor Gottlieb Reported-by: Amir Vadai Signed-off-by: Saeed Mahameed Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 0c056cfb4195db32a96a82640dd3f0f6ac48131a Author: Simon Horman Date: Thu Apr 21 11:49:15 2016 +1000 openvswitch: use flow protocol when recalculating ipv6 checksums [ Upstream commit b4f70527f052b0c00be4d7cac562baa75b212df5 ] When using masked actions the ipv6_proto field of an action to set IPv6 fields may be zero rather than the prevailing protocol which will result in skipping checksum recalculation. This patch resolves the problem by relying on the protocol in the flow key rather than that in the set field action. Fixes: 83d2b9ba1abc ("net: openvswitch: Support masked set actions.") Cc: Jarno Rajahalme Signed-off-by: Simon Horman Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 11236cf9aef1402ddc1734409d89d3ad76e5eddc Author: Ben Hutchings Date: Wed Apr 20 23:23:08 2016 +0100 atl2: Disable unimplemented scatter/gather feature [ Upstream commit f43bfaeddc79effbf3d0fcb53ca477cca66f3db8 ] atl2 includes NETIF_F_SG in hw_features even though it has no support for non-linear skbs. This bug was originally harmless since the driver does not claim to implement checksum offload and that used to be a requirement for SG. Now that SG and checksum offload are independent features, if you explicitly enable SG *and* use one of the rare protocols that can use SG without checkusm offload, this potentially leaks sensitive information (before you notice that it just isn't working). Therefore this obscure bug has been designated CVE-2016-2117. Reported-by: Justin Yackoski Signed-off-by: Ben Hutchings Fixes: ec5f06156423 ("net: Kill link between CSUM and SG features.") Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit a8be898cdbd0094677c5c908d37447c51b113c37 Author: Joe Stringer Date: Mon Apr 18 14:51:47 2016 -0700 openvswitch: Orphan skbs before IPv6 defrag [ Upstream commit 49e261a8a21e0960a3f7ff187a453ba1c1149053 ] This is the IPv6 counterpart to commit 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()"). Prior to commit 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone operations"), ipv6 fragments sent to nf_ct_frag6_gather() would be cloned (implicitly orphaning) prior to queueing for reassembly. As such, when the IPv6 message is eventually reassembled, the skb->sk for all fragments would be NULL. After that commit was introduced, rather than cloning, the original skbs were queued directly without orphaning. The end result is that all frags except for the first and last may have a socket attached. This commit explicitly orphans such skbs during nf_ct_frag6_gather() to prevent BUG_ON(skb->sk) during a later call to ip6_fragment(). kernel BUG at net/ipv6/ip6_output.c:631! [...] Call Trace: [] ? __lock_acquire+0x927/0x20a0 [] ? do_output.isra.28+0x1b0/0x1b0 [openvswitch] [] ? __lock_is_held+0x52/0x70 [] ovs_fragment+0x1f7/0x280 [openvswitch] [] ? mark_held_locks+0x75/0xa0 [] ? _raw_spin_unlock_irqrestore+0x36/0x50 [] ? dst_discard_out+0x20/0x20 [] ? dst_ifdown+0x80/0x80 [] do_output.isra.28+0xf3/0x1b0 [openvswitch] [] do_execute_actions+0x709/0x12c0 [openvswitch] [] ? ovs_flow_stats_update+0x74/0x1e0 [openvswitch] [] ? ovs_flow_stats_update+0xa1/0x1e0 [openvswitch] [] ? _raw_spin_unlock+0x27/0x40 [] ovs_execute_actions+0x45/0x120 [openvswitch] [] ovs_dp_process_packet+0x85/0x150 [openvswitch] [] ? _raw_spin_unlock+0x27/0x40 [] ovs_execute_actions+0xc4/0x120 [openvswitch] [] ovs_dp_process_packet+0x85/0x150 [openvswitch] [] ? key_extract+0x442/0xc10 [openvswitch] [] ovs_vport_receive+0x5d/0xb0 [openvswitch] [] ? __lock_acquire+0x927/0x20a0 [] ? __lock_acquire+0x927/0x20a0 [] ? __lock_acquire+0x927/0x20a0 [] ? _raw_spin_unlock_irqrestore+0x36/0x50 [] internal_dev_xmit+0x6d/0x150 [openvswitch] [] ? internal_dev_xmit+0x5/0x150 [openvswitch] [] dev_hard_start_xmit+0x2df/0x660 [] ? validate_xmit_skb.isra.105.part.106+0x1a/0x2b0 [] __dev_queue_xmit+0x8f5/0x950 [] ? __dev_queue_xmit+0x50/0x950 [] ? mark_held_locks+0x75/0xa0 [] dev_queue_xmit+0x10/0x20 [] neigh_resolve_output+0x178/0x220 [] ? ip6_finish_output2+0x219/0x7b0 [] ip6_finish_output2+0x219/0x7b0 [] ? ip6_finish_output2+0x65/0x7b0 [] ? ip_idents_reserve+0x6b/0x80 [] ? ip6_fragment+0x93f/0xc50 [] ip6_fragment+0xba1/0xc50 [] ? ip6_flush_pending_frames+0x40/0x40 [] ip6_finish_output+0xcb/0x1d0 [] ip6_output+0x5f/0x1a0 [] ? ip6_fragment+0xc50/0xc50 [] ip6_local_out+0x3d/0x80 [] ip6_send_skb+0x2f/0xc0 [] ip6_push_pending_frames+0x4d/0x50 [] icmpv6_push_pending_frames+0xac/0xe0 [] icmpv6_echo_reply+0x42e/0x500 [] icmpv6_rcv+0x4cf/0x580 [] ip6_input_finish+0x1a7/0x690 [] ? ip6_input_finish+0x5/0x690 [] ip6_input+0x30/0xa0 [] ? ip6_rcv_finish+0x1a0/0x1a0 [] ip6_rcv_finish+0x4e/0x1a0 [] ipv6_rcv+0x45f/0x7c0 [] ? ipv6_rcv+0x36/0x7c0 [] ? ip6_make_skb+0x1c0/0x1c0 [] __netif_receive_skb_core+0x229/0xb80 [] ? mark_held_locks+0x75/0xa0 [] ? process_backlog+0x6f/0x230 [] __netif_receive_skb+0x16/0x70 [] process_backlog+0x78/0x230 [] ? process_backlog+0xdd/0x230 [] net_rx_action+0x203/0x480 [] ? mark_held_locks+0x75/0xa0 [] __do_softirq+0xde/0x49f [] ? ip6_finish_output2+0x228/0x7b0 [] do_softirq_own_stack+0x1c/0x30 [] do_softirq.part.18+0x3b/0x40 [] __local_bh_enable_ip+0xb6/0xc0 [] ip6_finish_output2+0x251/0x7b0 [] ? ip6_fragment+0xba1/0xc50 [] ? ip_idents_reserve+0x6b/0x80 [] ? ip6_fragment+0x93f/0xc50 [] ip6_fragment+0xba1/0xc50 [] ? ip6_flush_pending_frames+0x40/0x40 [] ip6_finish_output+0xcb/0x1d0 [] ip6_output+0x5f/0x1a0 [] ? ip6_fragment+0xc50/0xc50 [] ip6_local_out+0x3d/0x80 [] ip6_send_skb+0x2f/0xc0 [] ip6_push_pending_frames+0x4d/0x50 [] rawv6_sendmsg+0xa28/0xe30 [] ? inet_sendmsg+0xc7/0x1d0 [] inet_sendmsg+0x106/0x1d0 [] ? inet_sendmsg+0x5/0x1d0 [] sock_sendmsg+0x38/0x50 [] SYSC_sendto+0xf6/0x170 [] ? trace_hardirqs_on_thunk+0x1b/0x1d [] SyS_sendto+0xe/0x10 [] entry_SYSCALL_64_fastpath+0x18/0xa8 Code: 06 48 83 3f 00 75 26 48 8b 87 d8 00 00 00 2b 87 d0 00 00 00 48 39 d0 72 14 8b 87 e4 00 00 00 83 f8 01 75 09 48 83 7f 18 00 74 9a <0f> 0b 41 8b 86 cc 00 00 00 49 8# RIP [] ip6_fragment+0x73a/0xc50 RSP Fixes: 029f7f3b8701 ("netfilter: ipv6: nf_defrag: avoid/free clone operations") Reported-by: Daniele Di Proietto Signed-off-by: Joe Stringer Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 2b1e1a75e30807a9d767f28e85a01acb0a521049 Author: Daniel Borkmann Date: Sat Apr 16 02:27:58 2016 +0200 vlan: pull on __vlan_insert_tag error path and fix csum correction [ Upstream commit 9241e2df4fbc648a92ea0752918e05c26255649e ] When __vlan_insert_tag() fails from skb_vlan_push() path due to the skb_cow_head(), we need to undo the __skb_push() in the error path as well that was done earlier to move skb->data pointer to mac header. Moreover, I noticed that when in the non-error path the __skb_pull() is done and the original offset to mac header was non-zero, we fixup from a wrong skb->data offset in the checksum complete processing. So the skb_postpush_rcsum() really needs to be done before __skb_pull() where skb->data still points to the mac header start and thus operates under the same conditions as in __vlan_insert_tag(). Fixes: 93515d53b133 ("net: move vlan pop/push functions into common code") Signed-off-by: Daniel Borkmann Reviewed-by: Jiri Pirko Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 9a11c27b88032a88b882a2ff6b2265f975145197 Author: Daniel Borkmann Date: Sat Feb 20 00:29:30 2016 +0100 net: use skb_postpush_rcsum instead of own implementations [ Upstream commit 6b83d28a55a891a9d70fc61ccb1c138e47dcbe74 ] Replace individual implementations with the recently introduced skb_postpush_rcsum() helper. Signed-off-by: Daniel Borkmann Acked-by: Tom Herbert Acked-by: Alexei Starovoitov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 5ac8f80af2fb18f1db0afacae3276901e17809ea Author: Craig Gallek Date: Tue Apr 12 13:11:25 2016 -0400 soreuseport: fix ordering for mixed v4/v6 sockets [ Upstream commit d894ba18d4e449b3a7f6eb491f16c9e02933736e ] With the SO_REUSEPORT socket option, it is possible to create sockets in the AF_INET and AF_INET6 domains which are bound to the same IPv4 address. This is only possible with SO_REUSEPORT and when not using IPV6_V6ONLY on the AF_INET6 sockets. Prior to the commits referenced below, an incoming IPv4 packet would always be routed to a socket of type AF_INET when this mixed-mode was used. After those changes, the same packet would be routed to the most recently bound socket (if this happened to be an AF_INET6 socket, it would have an IPv4 mapped IPv6 address). The change in behavior occurred because the recent SO_REUSEPORT optimizations short-circuit the socket scoring logic as soon as they find a match. They did not take into account the scoring logic that favors AF_INET sockets over AF_INET6 sockets in the event of a tie. To fix this problem, this patch changes the insertion order of AF_INET and AF_INET6 addresses in the TCP and UDP socket lists when the sockets have SO_REUSEPORT set. AF_INET sockets will be inserted at the head of the list and AF_INET6 sockets with SO_REUSEPORT set will always be inserted at the tail of the list. This will force AF_INET sockets to always be considered first. Fixes: e32ea7e74727 ("soreuseport: fast reuseport UDP socket selection") Fixes: 125e80b88687 ("soreuseport: fast reuseport TCP socket selection") Reported-by: Maciej Żenczykowski Signed-off-by: Craig Gallek Signed-off-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 62862f1047561038739d9ee09a92b554fc3da6d3 Author: Bjørn Mork Date: Tue Apr 12 16:11:12 2016 +0200 cdc_mbim: apply "NDP to end" quirk to all Huawei devices [ Upstream commit c5b5343cfbc9f46af65033fa4f407d7b7d98371d ] We now have a positive report of another Huawei device needing this quirk: The ME906s-158 (12d1:15c1). This is an m.2 form factor modem with no obvious relationship to the E3372 (12d1:157d) we already have a quirk entry for. This is reason enough to believe the quirk might be necessary for any number of current and future Huawei devices. Applying the quirk to all Huawei devices, since it is crucial to any device affected by the firmware bug, while the impact on non-affected devices is negligible. The quirk can if necessary be disabled per-device by writing N to /sys/class/net//cdc_ncm/ndp_to_end Reported-by: Andreas Fett Signed-off-by: Bjørn Mork Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 682c5a13beec6c33d8b65fa4ae612f1c74033d63 Author: Alexei Starovoitov Date: Tue Apr 12 10:26:19 2016 -0700 bpf/verifier: reject invalid LD_ABS | BPF_DW instruction [ Upstream commit d82bccc69041a51f7b7b9b4a36db0772f4cdba21 ] verifier must check for reserved size bits in instruction opcode and reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions, otherwise interpreter will WARN_RATELIMIT on them during execution. Fixes: ddd872bc3098 ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions") Signed-off-by: Alexei Starovoitov Acked-by: Daniel Borkmann Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit e4bb8baf1c4dabce8e704704c46701051988bd9d Author: Lars Persson Date: Tue Apr 12 08:45:52 2016 +0200 net: sched: do not requeue a NULL skb [ Upstream commit 3dcd493fbebfd631913df6e2773cc295d3bf7d22 ] A failure in validate_xmit_skb_list() triggered an unconditional call to dev_requeue_skb with skb=NULL. This slowly grows the queue discipline's qlen count until all traffic through the queue stops. We take the optimistic approach and continue running the queue after a failure since it is unknown if later packets also will fail in the validate path. Fixes: 55a93b3ea780 ("qdisc: validate skb without holding lock") Signed-off-by: Lars Persson Acked-by: Eric Dumazet Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 555694a8f03fae56a5952436c21defd1b072150b Author: Mathias Krause Date: Sun Apr 10 12:52:28 2016 +0200 packet: fix heap info leak in PACKET_DIAG_MCLIST sock_diag interface [ Upstream commit 309cf37fe2a781279b7675d4bb7173198e532867 ] Because we miss to wipe the remainder of i->addr[] in packet_mc_add(), pdiag_put_mclist() leaks uninitialized heap bytes via the PACKET_DIAG_MCLIST netlink attribute. Fix this by explicitly memset(0)ing the remaining bytes in i->addr[]. Fixes: eea68e2f1a00 ("packet: Report socket mclist info via diag module") Signed-off-by: Mathias Krause Cc: Eric W. Biederman Cc: Pavel Emelyanov Acked-by: Pavel Emelyanov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 992ce0a3636c20b6399fdb6d83cf0c14fff4f815 Author: Chris Friesen Date: Fri Apr 8 15:21:30 2016 -0600 route: do not cache fib route info on local routes with oif [ Upstream commit d6d5e999e5df67f8ec20b6be45e2229455ee3699 ] For local routes that require a particular output interface we do not want to cache the result. Caching the result causes incorrect behaviour when there are multiple source addresses on the interface. The end result being that if the intended recipient is waiting on that interface for the packet he won't receive it because it will be delivered on the loopback interface and the IP_PKTINFO ipi_ifindex will be set to the loopback interface as well. This can be tested by running a program such as "dhcp_release" which attempts to inject a packet on a particular interface so that it is received by another program on the same board. The receiving process should see an IP_PKTINFO ipi_ifndex value of the source interface (e.g., eth1) instead of the loopback interface (e.g., lo). The packet will still appear on the loopback interface in tcpdump but the important aspect is that the CMSG info is correct. Sample dhcp_release command line: dhcp_release eth1 192.168.204.222 02:11:33:22:44:66 Signed-off-by: Allain Legacy Signed off-by: Chris Friesen Reviewed-by: Julian Anastasov Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit 7f4c2607942a70f47ea754c8c9d1a5b3244d2359 Author: David S. Miller Date: Sun Apr 10 23:01:30 2016 -0400 decnet: Do not build routes to devices without decnet private data. [ Upstream commit a36a0d4008488fa545c74445d69eaf56377d5d4e ] In particular, make sure we check for decnet private presence for loopback devices. Signed-off-by: David S. Miller Signed-off-by: Greg Kroah-Hartman commit ec176528312064f611a7d1ac3bbe16675e2ab97e Author: Arnd Bergmann Date: Wed Jan 13 15:36:17 2016 +0100 staging: wilc1000: remove extraneous variable commit ce7b516f3f9e11fe4ee06fad0d7e853bb6e8f160 upstream. Building wilc1000 with clang currently fails in the staging-next branch: drivers/staging/wilc1000/wilc_spi.c:123:34: warning: tentative definition of variable with internal linkage has incomplete non-array type 'const struct wilc1000_ops' [-Wtentative-definition-incomplete-type] static const struct wilc1000_ops wilc1000_spi_ops; The reason is that wilc1000_ops was left behind after a recent cleanup, and is completely unused and also uninitialized and const and has an incomplete type. Removing the variable is obviously correct, and gets rid of the warning. No idea why gcc does not complain about it though. Signed-off-by: Arnd Bergmann Signed-off-by: Greg Kroah-Hartman Signed-off-by: Greg Kroah-Hartman