-
Notifications
You must be signed in to change notification settings - Fork 475
Apache Bench can fill up ipvs service proxy in seconds #544
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Check with |
I actually don't see any TIME_WAITS on the physical host, I do see a ton in the containers. I made 6 replicas this time, all still on the same host node6. Below is another picture of where the numbers are at when the requests stop being answered again. It seems the limit is still 14000, just divided among the containers now. Below is the number of TIME-WAITS retrieve from each container. |
Just wanted to point out I am not using DSR, which causes me to wonder why there is an accumulation of TIME_WAIT connections, since in my case shouldn't the LVS be able to see all packets sent both ways? |
TIME_WAIT is in the tcp standard. The state should linger for 2 minutes (depending on how the connection was shutdown) and ipvs also keeps the state to be able to forward stray packets. |
I've done some research on what is going on and it turns out there is a legitimate problem with IPVS. IPVS is not reusing ports like it is supposed to and thus the ephemeral ports are exhausted depending on the ephemeral port range (net.ipv4.ip_local_port_range). Setting net.ipv4.vs.conntrack=0 in sysctl somehow solves the re-use problem, but it breaks nodeport (and probably other stuff) so I don't believe that is the solution. I don't know if it's just CentOS 7 that is affected or this is a broader problem but I imagine many other engineering teams using IPVS as a service proxy are going to eventually encounter this limitation. |
We have been investigating the problem and have come to the following conclusions:
|
Thank you for looking into this. We aren't utilizing BGP or ECMP yet so a load balancer will add all nodes regardless. For example disabling conntrack won't affect a POD hitting a service IP to reach another POD on a different node? And will session affinity like clientip still work? EDIT: EDIT2: setting: appears to solve everything, even node port! |
Special sauce for me seems to be: net.ipv4.vs.conntrack=1 |
Our tests showed that disabling reuse with 'net.ipv4.vs.conn_reuse_mode=0' will interfere with scaling. When adding more pods in a high traffic scenario the traffic will stick to the old and overloaded pods and when scaling down, the traffic will be send to non existent pods. |
Please read this excellent comment on a refered issue; moby/moby#35082 (comment) Be aware that the case where a stream of connects from a single source may not be the common case in real life. It is more likely that you have few connections but from very many sources. You may try to tune your system to handle a case that only exist in your lab. While doing so you tweak parameters that are standard and are there for a reason. The result may be that your app becomes more unstable in real life where the networks is less reliable but performs excellent in your lab which is probably a LAN. |
Have you ever set |
One of the suggestions was to set --notrack on host:
This makes issues with non local pod communication, AFAIK. Also, for reference "one second delay communication" article which explains and provides some of solutions, https://marc.info/?l=linux-virtual-server&m=151743061027765&w=2 |
I have the same confusing feeling that why IPVS drops SYN packet that hits IPVS connection in TIME_WAIT state if such connection uses Netfilter connection tracking (conntrack=1)? |
@neeseius we have set conn_reuse_mode to 0 in the lastest build, could you test if you are experiencing the same problem with |
@roffe I tried your image in our setup and it seems to solve the problem! When running with |
no, that was the only change |
This does appear to the solve the problem in my testing, even when scaling up and down. I know we toyed with these parameters before, but it interfered with scaling. However I noticed this is new: Is that was made the difference? |
|
v0.2.3 released with IPVS throughput fixes |
Don't understand how can the last two be used in the same time when the kernel docs about
so by setting |
And this is the main problem I see with this since setting it to zero basically disables |
must be a typo in the docs, kernel does not check if conn_reuse_mode is 0 when expiring nodest conn it seems: https://github.com/torvalds/linux/blob/master/net/netfilter/ipvs/ip_vs_core.c#L1982 |
Otherwise, will meet issue cloudnativelabs/kube-router#544
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f678ce8cc3cb2ad29df75d8824c74f36398ba871 ] ddebug_describe_flags() currently fills a caller provided string buffer, after testing its size (also passed) in a BUG_ON. Fix this by replacing them with a known-big-enough string buffer wrapped in a struct, and passing that instead. Also simplify ddebug_describe_flags() flags parameter from a struct to a member in that struct, and hoist the member deref up to the caller. This makes the function reusable (soon) where flags are unpacked. Acked-by: <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> bcache: fix super block seq numbers comparision in register_cache_set() [ Upstream commit 117f636ea695270fe492d0c0c9dfadc7a662af47 ] In register_cache_set(), c is pointer to struct cache_set, and ca is pointer to struct cache, if ca->sb.seq > c->sb.seq, it means this registering cache has up to date version and other members, the in- memory version and other members should be updated to the newer value. But current implementation makes a cache set only has a single cache device, so the above assumption works well except for a special case. The execption is when a cache device new created and both ca->sb.seq and c->sb.seq are 0, because the super block is never flushed out yet. In the location for the following if() check, 2156 if (ca->sb.seq > c->sb.seq) { 2157 c->sb.version = ca->sb.version; 2158 memcpy(c->sb.set_uuid, ca->sb.set_uuid, 16); 2159 c->sb.flags = ca->sb.flags; 2160 c->sb.seq = ca->sb.seq; 2161 pr_debug("set version = %llu\n", c->sb.version); 2162 } c->sb.version is not initialized yet and valued 0. When ca->sb.seq is 0, the if() check will fail (because both values are 0), and the cache set version, set_uuid, flags and seq won't be updated. The above problem is hiden for current code, because the bucket size is compatible among different super block version. And the next time when running cache set again, ca->sb.seq will be larger than 0 and cache set super block version will be updated properly. But if the large bucket feature is enabled, sb->bucket_size is the low 16bits of the bucket size. For a power of 2 value, when the actual bucket size exceeds 16bit width, sb->bucket_size will always be 0. Then read_super_common() will fail because the if() check to is_power_of_2(sb->bucket_size) is false. This is how the long time hidden bug is triggered. This patch modifies the if() check to the following way, 2156 if (ca->sb.seq > c->sb.seq || c->sb.seq == 0) { Then cache set's version, set_uuid, flags and seq will always be updated corectly including for a new created cache device. Signed-off-by: Coly Li <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ACPICA: Do not increment operation_region reference counts for field units [ Upstream commit 6a54ebae6d047c988a31f5ac5a64ab5cf83797a2 ] ACPICA commit e17b28cfcc31918d0db9547b6b274b09c413eb70 Object reference counts are used as a part of ACPICA's garbage collection mechanism. This mechanism keeps track of references to heap-allocated structures such as the ACPI operand objects. Recent server firmware has revealed that this reference count can overflow on large servers that declare many field units under the same operation_region. This occurs because each field unit declaration will add a reference count to the source operation_region. This change solves the reference count overflow for operation_regions objects by preventing fieldunits from incrementing their operation_region's reference count. Each operation_region's reference count will not be changed by named objects declared under the Field operator. During namespace deletion, the operation_region namespace node will be deleted and each fieldunit will be deleted without touching the deleted operation_region object. Link: https://github.com/acpica/acpica/commit/e17b28cf Signed-off-by: Erik Kaneda <[email protected]> Signed-off-by: Bob Moore <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]> agp/intel: Fix a memory leak on module initialisation failure [ Upstream commit b975abbd382fe442713a4c233549abb90e57c22b ] In intel_gtt_setup_scratch_page(), pointer "page" is not released if pci_dma_mapping_error() return an error, leading to a memory leak on module initialisation failure. Simply fix this issue by freeing "page" before return. Fixes: 0e87d2b06cb46 ("intel-gtt: initialize our own scratch page") Signed-off-by: Qiushi Wu <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> video: fbdev: sm712fb: fix an issue about iounmap for a wrong address [ Upstream commit 98bd4f72988646c35569e1e838c0ab80d06c77f6 ] the sfb->fb->screen_base is not save the value get by iounmap() when the chip id is 0x720. so iounmap() for address sfb->fb->screen_base is not right. Fixes: 1461d6672864854 ("staging: sm7xxfb: merge sm712fb with fbdev") Cc: Andy Shevchenko <[email protected]> Cc: Sudip Mukherjee <[email protected]> Cc: Teddy Wang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Dejin Zheng <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> console: newport_con: fix an issue about leak related system resources [ Upstream commit fd4b8243877250c05bb24af7fea5567110c9720b ] A call of the function do_take_over_console() can fail here. The corresponding system resources were not released then. Thus add a call of iounmap() and release_mem_region() together with the check of a failure predicate. and also add release_mem_region() on device removal. Fixes: e86bb8acc0fdc ("[PATCH] VT binding: Make newport_con support binding") Suggested-by: Bartlomiej Zolnierkiewicz <[email protected]> Signed-off-by: Dejin Zheng <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> cc: Thomas Gleixner <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> video: pxafb: Fix the function used to balance a 'dma_alloc_coherent()' call [ Upstream commit 499a2c41b954518c372873202d5e7714e22010c4 ] 'dma_alloc_coherent()' must be balanced by a call to 'dma_free_coherent()' not 'dma_free_wc()'. The correct dma_free_ function is already used in the error handling path of the probe function. Fixes: 77e196752bdd ("[ARM] pxafb: allow video memory size to be configurable") Signed-off-by: Christophe JAILLET <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Jani Nikula <[email protected]> cc: Mauro Carvalho Chehab <[email protected]> Cc: Eric Miao <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> iio: improve IIO_CONCENTRATION channel type description [ Upstream commit df16c33a4028159d1ba8a7061c9fa950b58d1a61 ] IIO_CONCENTRATION together with INFO_RAW specifier is used for reporting raw concentrations of pollutants. Raw value should be meaningless before being properly scaled. Because of that description shouldn't mention raw value unit whatsoever. Fix this by rephrasing existing description so it follows conventions used throughout IIO ABI docs. Fixes: 8ff6b3bc94930 ("iio: chemical: Add IIO_CONCENTRATION channel type") Signed-off-by: Tomasz Duszynski <[email protected]> Acked-by: Matt Ranostay <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/arm: fix unintentional integer overflow on left shift [ Upstream commit 5f368ddea6fec519bdb93b5368f6a844b6ea27a6 ] Shifting the integer value 1 is evaluated using 32-bit arithmetic and then used in an expression that expects a long value leads to a potential integer overflow. Fix this by using the BIT macro to perform the shift to avoid the overflow. Addresses-Coverity: ("Unintentional integer overflow") Fixes: ad49f8602fe8 ("drm/arm: Add support for Mali Display Processors") Signed-off-by: Colin Ian King <[email protected]> Acked-by: Liviu Dudau <[email protected]> Signed-off-by: Liviu Dudau <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> leds: lm355x: avoid enum conversion warning [ Upstream commit 985b1f596f9ed56f42b8c2280005f943e1434c06 ] clang points out that doing arithmetic between diffent enums is usually a mistake: drivers/leds/leds-lm355x.c:167:28: warning: bitwise operation between different enumeration types ('enum lm355x_tx2' and 'enum lm355x_ntc') [-Wenum-enum-conversion] reg_val = pdata->pin_tx2 | pdata->ntc_pin; ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ drivers/leds/leds-lm355x.c:178:28: warning: bitwise operation between different enumeration types ('enum lm355x_tx2' and 'enum lm355x_ntc') [-Wenum-enum-conversion] reg_val = pdata->pin_tx2 | pdata->ntc_pin | pdata->pass_mode; ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ In this driver, it is intentional, so add a cast to hide the false-positive warning. It appears to be the only instance of this warning at the moment. Fixes: b98d13c72592 ("leds: Add new LED driver for lm355x chips") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Pavel Machek <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: omap3isp: Add missed v4l2_ctrl_handler_free() for preview_init_entities() [ Upstream commit dc7690a73017e1236202022e26a6aa133f239c8c ] preview_init_entities() does not call v4l2_ctrl_handler_free() when it fails. Add the missed function to fix it. Fixes: de1135d44f4f ("[media] omap3isp: CCDC, preview engine and resizer") Signed-off-by: Chuhong Yuan <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Signed-off-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ASoC: Intel: bxt_rt298: add missing .owner field [ Upstream commit 88cee34b776f80d2da04afb990c2a28c36799c43 ] This field is required for ASoC cards. Not setting it will result in a module->name pointer being NULL and generate problems such as cat /proc/asound/modules 0 (efault) Fixes: 76016322ec56 ('ASoC: Intel: Add Broxton-P machine driver') Reported-by: Jaroslav Kysela <[email protected]> Suggested-by: Takashi Iwai <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: cumana_2: Fix different dev_id between request_irq() and free_irq() [ Upstream commit 040ab9c4fd0070cd5fa71ba3a7b95b8470db9b4d ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Acked-by: Russell King <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/mipi: use dcs write for mipi_dsi_dcs_set_tear_scanline [ Upstream commit 7a05c3b6d24b8460b3cec436cf1d33fac43c8450 ] The helper uses the MIPI_DCS_SET_TEAR_SCANLINE, although it's currently using the generic write. This does not look right. Perhaps some platforms don't distinguish between the two writers? Cc: Robert Chiras <[email protected]> Cc: Vinay Simha BN <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Thierry Reding <[email protected]> Fixes: e83950816367 ("drm/dsi: Implement set tear scanline") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Thierry Reding <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> cxl: Fix kobject memleak [ Upstream commit 85c5cbeba8f4fb28e6b9bfb3e467718385f78f76 ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Fixes: b087e6190ddc ("cxl: Export optional AFU configuration record in sysfs") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Acked-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/radeon: fix array out-of-bounds read and write issues [ Upstream commit 7ee78aff9de13d5dccba133f4a0de5367194b243 ] There is an off-by-one bounds check on the index into arrays table->mc_reg_address and table->mc_reg_table_entry[k].mc_data[j] that can lead to reads and writes outside of arrays. Fix the bound checking off-by-one error. Addresses-Coverity: ("Out-of-bounds read/write") Fixes: cc8dbbb4f62a ("drm/radeon: add dpm support for CI dGPUs (v2)") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: powertec: Fix different dev_id between request_irq() and free_irq() [ Upstream commit d179f7c763241c1dc5077fca88ddc3c47d21b763 ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: eesox: Fix different dev_id between request_irq() and free_irq() [ Upstream commit 86f2da1112ccf744ad9068b1d5d9843faf8ddee6 ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ipvs: allow connection reuse for unconfirmed conntrack [ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 https://github.com/kubernetes/kubernetes/issues/70747 - Apache Bench can fill up ipvs service proxy in seconds #544 https://github.com/cloudnativelabs/kube-router/issues/544 - Additional 1s latency in `host -> service IP -> pod` https://github.com/kubernetes/kubernetes/issues/90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: firewire: Using uninitialized values in node_probe() [ Upstream commit 2505a210fc126599013aec2be741df20aaacc490 ] If fw_csr_string() returns -ENOENT, then "name" is uninitialized. So then the "strlen(model_names[i]) <= name_len" is true because strlen() is unsigned and -ENOENT is type promoted to a very high positive value. Then the "strncmp(name, model_names[i], name_len)" uses uninitialized data because "name" is uninitialized. Fixes: 92374e886c75 ("[media] firedtv: drop obsolete backend abstraction") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: exynos4-is: Add missed check for pinctrl_lookup_state() [ Upstream commit 18ffec750578f7447c288647d7282c7d12b1d969 ] fimc_md_get_pinctrl() misses a check for pinctrl_lookup_state(). Add the missed check to fix it. Fixes: 4163851f7b99 ("[media] s5p-fimc: Use pinctrl API for camera ports configuration]") Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> xfs: fix reflink quota reservation accounting error [ Upstream commit 83895227aba1ade33e81f586aa7b6b1e143096a5 ] Quota reservations are supposed to account for the blocks that might be allocated due to a bmap btree split. Reflink doesn't do this, so fix this to make the quota accounting more accurate before we start rearranging things. Fixes: 862bb360ef56 ("xfs: reflink extents from one file to another") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Sasha Levin <[email protected]> PCI: Fix pci_cfg_wait queue locking problem [ Upstream commit 2a7e32d0547f41c5ce244f84cf5d6ca7fccee7eb ] The pci_cfg_wait queue is used to prevent user-space config accesses to devices while they are recovering from reset. Previously we used these operations on pci_cfg_wait: __add_wait_queue(&pci_cfg_wait, ...) __remove_wait_queue(&pci_cfg_wait, ...) wake_up_all(&pci_cfg_wait) The wake_up acquires the wait queue lock, but the add and remove do not. Originally these were all protected by the pci_lock, but cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock"), moved wake_up_all() outside pci_lock, so it could race with add/remove operations, which caused occasional kernel panics, e.g., during vfio-pci hotplug/unplug testing: Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000 Resolve this by using wait_event() instead of __add_wait_queue() and __remove_wait_queue(). The wait queue lock is held by both wait_event() and wake_up_all(), so it provides mutual exclusion. Fixes: cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock") Link: https://lore.kernel.org/linux-pci/[email protected]/T/#u Based-on: https://lore.kernel.org/linux-pci/[email protected]/ Based-on-patch-by: Xiang Zheng <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Tested-by: Xiang Zheng <[email protected]> Cc: Heyi Guo <[email protected]> Cc: Biaoxiang Ye <[email protected]> Signed-off-by: Sasha Levin <[email protected]> leds: core: Flush scheduled work for system suspend [ Upstream commit 302a085c20194bfa7df52e0fe684ee0c41da02e6 ] Sometimes LED won't be turned off by LED_CORE_SUSPENDRESUME flag upon system suspend. led_set_brightness_nopm() uses schedule_work() to set LED brightness. However, there's no guarantee that the scheduled work gets executed because no one flushes the work. So flush the scheduled work to make sure LED gets turned off. Signed-off-by: Kai-Heng Feng <[email protected]> Acked-by: Jacek Anaszewski <[email protected]> Fixes: 81fe8e5b73e3 ("leds: core: Add led_set_brightness_nosleep{nopm} functions") Signed-off-by: Pavel Machek <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm: panel: simple: Fix bpc for LG LB070WV8 panel [ Upstream commit a6ae2fe5c9f9fd355a48fb7d21c863e5b20d6c9c ] The LG LB070WV8 panel incorrectly reports a 16 bits per component value, while the panel uses 8 bits per component. Fix it. Fixes: dd0150026901 ("drm/panel: simple: Add support for LG LB070WV8 800x480 7" panel") Signed-off-by: Laurent Pinchart <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> drm/bridge: sil_sii8620: initialize return of sii8620_readb [ Upstream commit 02cd2d3144653e6e2a0c7ccaa73311e48e2dc686 ] clang static analysis flags this error sil-sii8620.c:184:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return ret; ^~~~~~~~~~ sii8620_readb calls sii8620_read_buf. sii8620_read_buf can return without setting its output pararmeter 'ret'. So initialize ret. Fixes: ce6e153f414a ("drm/bridge: add Silicon Image SiI8620 driver") Signed-off-by: Tom Rix <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> scsi: scsi_debug: Add check for sdebug_max_queue during module init [ Upstream commit c87bf24cfb60bce27b4d2c7e56ebfd86fb9d16bb ] sdebug_max_queue should not exceed SDEBUG_CANQUEUE, otherwise crashes like this can be triggered by passing an out-of-range value: Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019 pstate: 20400009 (nzCv daif +PAN -UAO BTYPE=--) pc : schedule_resp+0x2a4/0xa70 [scsi_debug] lr : schedule_resp+0x52c/0xa70 [scsi_debug] sp : ffff800022ab36f0 x29: ffff800022ab36f0 x28: ffff0023a935a610 x27: ffff800008e0a648 x26: 0000000000000003 x25: ffff0023e84f3200 x24: 00000000003d0900 x23: 0000000000000000 x22: 0000000000000000 x21: ffff0023be60a320 x20: ffff0023be60b538 x19: ffff800008e13000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000001 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 00000000000000c1 x5 : 0000020000200000 x4 : dead0000000000ff x3 : 0000000000000200 x2 : 0000000000000200 x1 : ffff800008e13d88 x0 : 0000000000000000 Call trace: schedule_resp+0x2a4/0xa70 [scsi_debug] scsi_debug_queuecommand+0x2c4/0x9e0 [scsi_debug] scsi_queue_rq+0x698/0x840 __blk_mq_try_issue_directly+0x108/0x228 blk_mq_request_issue_directly+0x58/0x98 blk_mq_try_issue_list_directly+0x5c/0xf0 blk_mq_sched_insert_requests+0x18c/0x200 blk_mq_flush_plug_list+0x11c/0x190 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x38/0x210 blkdev_direct_IO+0x450/0x4d8 generic_file_read_iter+0x84/0x180 blkdev_read_iter+0x3c/0x50 aio_read+0xc0/0x170 io_submit_one+0x5c8/0xc98 __arm64_sys_io_submit+0x1b0/0x258 el0_svc_common.constprop.3+0x68/0x170 do_el0_svc+0x24/0x90 el0_sync_handler+0x13c/0x1a8 el0_sync+0x158/0x180 Code: 528847e0 72a001e0 6b00003f 540018cd (3941c340) In addition, it should not be less than 1. So add checks for these, and fail the module init for those cases. [mkp: changed if condition to match error message] Link: https://lore.kernel.org/r/[email protected] Fixes: c483739430f1 ("scsi_debug: add multiple queue support") Reviewed-by: Ming Lei <[email protected]> Acked-by: Douglas Gilbert <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> mwifiex: Prevent memory corruption handling keys [ Upstream commit e18696786548244914f36ec3c46ac99c53df99c3 ] The length of the key comes from the network and it's a 16 bit number. It needs to be capped to prevent a buffer overflow. Fixes: 5e6e3a92b9a4 ("wireless: mwifiex: initial commit for Marvell mwifiex driver") Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Ganapathi Bhat <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/20200708115857.GA13729@mwanda Signed-off-by: Sasha Levin <[email protected]> powerpc/vdso: Fix vdso cpu truncation [ Upstream commit a9f675f950a07d5c1dbcbb97aabac56f5ed085e3 ] The code in vdso_cpu_init that exposes the cpu and numa node to userspace via SPRG_VDSO incorrctly masks the cpu to 12 bits. This means that any kernel running on a box with more than 4096 threads (NR_CPUS advertises a limit of of 8192 cpus) would expose userspace to two cpu contexts running at the same time with the same cpu number. Note: I'm not aware of any distro shipping a kernel with support for more than 4096 threads today, nor of any system image that currently exceeds 4096 threads. Found via code browsing. Fixes: 18ad51dd342a7eb09dbcd059d0b451b616d4dafc ("powerpc: Add VDSO version of getcpu") Signed-off-by: Milton Miller <[email protected]> Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> staging: rtl8192u: fix a dubious looking mask before a shift [ Upstream commit c4283950a9a4d3bf4a3f362e406c80ab14f10714 ] Currently the masking of ret with 0xff and followed by a right shift of 8 bits always leaves a zero result. It appears the mask of 0xff is incorrect and should be 0xff00, but I don't have the hardware to test this. Fix this to mask the upper 8 bits before shifting. [ Not tested ] Addresses-Coverity: ("Operands don't affect result") Fixes: 8fc8598e61f6 ("Staging: Added Realtek rtl8192u driver to staging") Signed-off-by: Colin Ian King <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> PCI/ASPM: Add missing newline in sysfs 'policy' [ Upstream commit 3167e3d340c092fd47924bc4d23117a3074ef9a9 ] When I cat ASPM parameter 'policy' by sysfs, it displays as follows. Add a newline for easy reading. Other sysfs attributes already include a newline. [root@localhost ~]# cat /sys/module/pcie_aspm/parameters/policy [default] performance powersave powersupersave [root@localhost ~]# Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiongfeng Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/imx: tve: fix regulator_disable error path [ Upstream commit 7bb58b987fee26da2a1665c01033022624986b7c ] Add missing regulator_disable() as devm_action to avoid dedicated unbind() callback and fix the missing error handling. Fixes: fcbc51e54d2a ("staging: drm/imx: Add support for Television Encoder (TVEv2)") Signed-off-by: Marco Felsch <[email protected]> Signed-off-by: Philipp Zabel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> USB: serial: iuu_phoenix: fix led-activity helpers [ Upstream commit de37458f8c2bfc465500a1dd0d15dbe96d2a698c ] The set-led command is eight bytes long and starts with a command byte followed by six bytes of RGB data and ends with a byte encoding a frequency (see iuu_led() and iuu_rgbf_fill_buffer()). The led activity helpers had a few long-standing bugs which corrupted the command packets by inserting a second command byte and thereby offsetting the RGB data and dropping the frequency in non-xmas mode. In xmas mode, a related off-by-one error left the frequency field uninitialised. Fixes: 60a8fc017103 ("USB: add iuu_phoenix driver") Reported-by: George Spelvin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Sasha Levin <[email protected]> thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor() [ Upstream commit 0f348db01fdf128813fdd659fcc339038fb421a4 ] This condition is reversed and will cause breakage. Fixes: 7440f518dad9 ("thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTR") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/20200616091949.GA11940@mwanda Signed-off-by: Sasha Levin <[email protected]> coresight: tmc: Fix TMC mode read in tmc_read_unprepare_etb() [ Upstream commit d021f5c5ff679432c5e9faee0fd7350db2efb97c ] Reading TMC mode register without proper coresight power management can lead to exceptions like the one in the call trace below in tmc_read_unprepare_etb() when the trace data is read after the sink is disabled. So fix this by having a check for coresight sysfs mode before reading TMC mode management register in tmc_read_unprepare_etb() similar to tmc_read_prepare_etb(). SError Interrupt on CPU6, code 0xbe000411 -- SError pstate: 80400089 (Nzcv daIf +PAN -UAO) pc : tmc_read_unprepare_etb+0x74/0x108 lr : tmc_read_unprepare_etb+0x54/0x108 sp : ffffff80d9507c30 x29: ffffff80d9507c30 x28: ffffff80b3569a0c x27: 0000000000000000 x26: 00000000000a0001 x25: ffffff80cbae9550 x24: 0000000000000010 x23: ffffffd07296b0f0 x22: ffffffd0109ee028 x21: 0000000000000000 x20: ffffff80d19e70e0 x19: ffffff80d19e7080 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: dfffffd000000001 x9 : 0000000000000000 x8 : 0000000000000002 x7 : ffffffd071d0fe78 x6 : 0000000000000000 x5 : 0000000000000080 x4 : 0000000000000001 x3 : ffffffd071d0fe98 x2 : 0000000000000000 x1 : 0000000000000004 x0 : 0000000000000001 Kernel panic - not syncing: Asynchronous SError Interrupt Fixes: 4525412a5046 ("coresight: tmc: making prepare/unprepare functions generic") Reported-by: Mike Leach <[email protected]> Signed-off-by: Sai Prakash Ranjan <[email protected]> Tested-by: Mike Leach <[email protected]> Signed-off-by: Mathieu Poirier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> MIPS: OCTEON: add missing put_device() call in dwc3_octeon_device_init() [ Upstream commit e8b9fc10f2615b9a525fce56981e40b489528355 ] if of_find_device_by_node() succeed, dwc3_octeon_device_init() doesn't have a corresponding put_device(). Thus add put_device() to fix the exception handling for this function implementation. Fixes: 93e502b3c2d4 ("MIPS: OCTEON: Platform support for OCTEON III USB controller") Signed-off-by: Yu Kuai <[email protected]> Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Sasha Levin <[email protected]> usb: dwc2: Fix error path in gadget registration [ Upstream commit 33a06f1300a79cfd461cea0268f05e969d4f34ec ] When gadget registration fails, one should not call usb_del_gadget_udc(). Ensure this by setting gadget->udc to NULL. Also in case of a failure there is no need to disable low-level hardware, so return immiedetly instead of jumping to error_init label. This fixes the following kernel NULL ptr dereference on gadget failure (can be easily triggered with g_mass_storage without any module parameters): dwc2 12480000.hsotg: dwc2_check_params: Invalid parameter besl=1 dwc2 12480000.hsotg: dwc2_check_params: Invalid parameter g_np_tx_fifo_size=1024 dwc2 12480000.hsotg: EPs: 16, dedicated fifos, 7808 entries in SPRAM Mass Storage Function, version: 2009/09/11 LUN: removable file: (no medium) no file given for LUN0 g_mass_storage 12480000.hsotg: failed to start g_mass_storage: -22 8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000104 pgd = (ptrval) [00000104] *pgd=00000000 Internal error: Oops: 805 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.8.0-rc5 #3133 Hardware name: Samsung Exynos (Flattened Device Tree) Workqueue: events deferred_probe_work_func PC is at usb_del_gadget_udc+0x38/0xc4 LR is at __mutex_lock+0x31c/0xb18 ... Process kworker/0:1 (pid: 12, stack limit = 0x(ptrval)) Stack: (0xef121db0 to 0xef122000) ... [<c076bf3c>] (usb_del_gadget_udc) from [<c0726bec>] (dwc2_hsotg_remove+0x10/0x20) [<c0726bec>] (dwc2_hsotg_remove) from [<c0711208>] (dwc2_driver_probe+0x57c/0x69c) [<c0711208>] (dwc2_driver_probe) from [<c06247c0>] (platform_drv_probe+0x6c/0xa4) [<c06247c0>] (platform_drv_probe) from [<c0621df4>] (really_probe+0x200/0x48c) [<c0621df4>] (really_probe) from [<c06221e8>] (driver_probe_device+0x78/0x1fc) [<c06221e8>] (driver_probe_device) from [<c061fcd4>] (bus_for_each_drv+0x74/0xb8) [<c061fcd4>] (bus_for_each_drv) from [<c0621b54>] (__device_attach+0xd4/0x16c) [<c0621b54>] (__device_attach) from [<c0620c98>] (bus_probe_device+0x88/0x90) [<c0620c98>] (bus_probe_device) from [<c06211b0>] (deferred_probe_work_func+0x3c/0xd0) [<c06211b0>] (deferred_probe_work_func) from [<c0149280>] (process_one_work+0x234/0x7dc) [<c0149280>] (process_one_work) from [<c014986c>] (worker_thread+0x44/0x51c) [<c014986c>] (worker_thread) from [<c0150b1c>] (kthread+0x158/0x1a0) [<c0150b1c>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20) Exception stack(0xef121fb0 to 0xef121ff8) ... ---[ end trace 9724c2fc7cc9c982 ]--- While fixing this also fix the double call to dwc2_lowlevel_hw_disable() if dr_mode is set to USB_DR_MODE_PERIPHERAL. In such case low-level hardware is already disabled before calling usb_add_gadget_udc(). That function correctly preserves low-level hardware state, there is no need for the second unconditional dwc2_lowlevel_hw_disable() call. Fixes: 207324a321a8 ("usb: dwc2: Postponed gadget registration to the udc class driver") Acked-by: Minas Harutyunyan <[email protected]> Signed-off-by: Marek Szyprowski <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: mesh: Fix panic after host or bus reset [ Upstream commit edd7dd2292ab9c3628b65c4d04514c3068ad54f6 ] Booting Linux with a Conner CP3200 drive attached to the MESH SCSI bus results in EH measures and a panic: [ 25.499838] mesh: configured for synchronous 5 MB/s [ 25.787154] mesh: performing initial bus reset... [ 29.867115] scsi host0: MESH [ 29.929527] mesh: target 0 synchronous at 3.6 MB/s [ 29.998763] scsi 0:0:0:0: Direct-Access CONNER CP3200-200mb-3.5 4040 PQ: 0 ANSI: 1 CCS [ 31.989975] sd 0:0:0:0: [sda] 415872 512-byte logical blocks: (213 MB/203 MiB) [ 32.070975] sd 0:0:0:0: [sda] Write Protect is off [ 32.137197] sd 0:0:0:0: [sda] Mode Sense: 5b 00 00 08 [ 32.209661] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 32.332708] sda: [mac] sda1 sda2 sda3 [ 32.417733] sd 0:0:0:0: [sda] Attached SCSI disk ... snip ... [ 76.687067] mesh_abort((ptrval)) [ 76.743606] mesh: state at (ptrval), regs at (ptrval), dma at (ptrval) [ 76.810798] ct=6000 seq=86 bs=4017 fc= 0 exc= 0 err= 0 im= 7 int= 0 sp=85 [ 76.880720] dma stat=84e0 cmdptr=1f73d000 [ 76.941387] phase=4 msgphase=0 conn_tgt=0 data_ptr=24576 [ 77.005567] dma_st=1 dma_ct=0 n_msgout=0 [ 77.065456] target 0: req=(ptrval) goes_out=0 saved_ptr=0 [ 77.130512] mesh_abort((ptrval)) [ 77.187670] mesh: state at (ptrval), regs at (ptrval), dma at (ptrval) [ 77.255594] ct=6000 seq=86 bs=4017 fc= 0 exc= 0 err= 0 im= 7 int= 0 sp=85 [ 77.325778] dma stat=84e0 cmdptr=1f73d000 [ 77.387239] phase=4 msgphase=0 conn_tgt=0 data_ptr=24576 [ 77.453665] dma_st=1 dma_ct=0 n_msgout=0 [ 77.515900] target 0: req=(ptrval) goes_out=0 saved_ptr=0 [ 77.582902] mesh_host_reset [ 88.187083] Kernel panic - not syncing: mesh: double DMA start ! [ 88.254510] CPU: 0 PID: 358 Comm: scsi_eh_0 Not tainted 5.6.13-pmac #1 [ 88.323302] Call Trace: [ 88.378854] [e16ddc58] [c0027080] panic+0x13c/0x308 (unreliable) [ 88.446221] [e16ddcb8] [c02b2478] mesh_start.part.12+0x130/0x414 [ 88.513298] [e16ddcf8] [c02b2fc8] mesh_queue+0x54/0x70 [ 88.577097] [e16ddd18] [c02a1848] scsi_send_eh_cmnd+0x374/0x384 [ 88.643476] [e16dddc8] [c02a1938] scsi_eh_tur+0x5c/0xb8 [ 88.707878] [e16dddf8] [c02a1ab8] scsi_eh_test_devices+0x124/0x178 [ 88.775663] [e16dde28] [c02a2094] scsi_eh_ready_devs+0x588/0x8a8 [ 88.843124] [e16dde98] [c02a31d8] scsi_error_handler+0x344/0x520 [ 88.910697] [e16ddf08] [c00409c8] kthread+0xe4/0xe8 [ 88.975166] [e16ddf38] [c000f234] ret_from_kernel_thread+0x14/0x1c [ 89.044112] Rebooting in 180 seconds.. In theory, a panic can happen after a bus or host reset with dma_started flag set. Fix this by halting the DMA before reinitializing the host. Don't assume that ms->current_req is set when halt_dma() is invoked as it may not hold for bus or host reset. BTW, this particular Conner drive can be made to work by inhibiting disconnect/reselect with 'mesh.resel_targets=0'. Link: https://lore.kernel.org/r/3952bc691e150a7128b29120999b6092071b039a.1595460351.git.fthain@telegraphics.com.au Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: Paul Mackerras <[email protected]> Reported-and-tested-by: Stan Johnson <[email protected]> Signed-off-by: Finn Thain <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration [ Upstream commit 0f3c66a3c7b4e8b9f654b3c998e9674376a51b0f ] The MV88E6097 chip does not support configuring jumbo frames. Prior to commit 5f4366660d65 only the 6352, 6351, 6165 and 6320 chips configured jumbo mode. The refactor accidentally added the function for the 6097. Remove the erroneous function pointer assignment. Fixes: 5f4366660d65 ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames") Signed-off-by: Chris Packham <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Smack: fix another vsscanf out of bounds [ Upstream commit a6bd4f6d9b07452b0b19842044a6c3ea384b0b88 ] This is similar to commit 84e99e58e8d1 ("Smack: slab-out-of-bounds in vsscanf") where we added a bounds check on "rule". Reported-by: [email protected] Fixes: f7112e6c9abf ("Smack: allow for significantly longer Smack labels v4") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Smack: prevent underflow in smk_set_cipso() [ Upstream commit 42a2df3e829f3c5562090391b33714b2e2e5ad4a ] We have an upper bound on "maplevel" but forgot to check for negative values. Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]> power: supply: check if calc_soc succeeded in pm860x_init_battery [ Upstream commit ccf193dee1f0fff55b556928591f7818bac1b3b1 ] clang static analysis flags this error 88pm860x_battery.c:522:19: warning: Assigned value is garbage or undefined [core.uninitialized.Assign] info->start_soc = soc; ^ ~~~ soc is set by calling calc_soc. But calc_soc can return without setting soc. So check the return status and bail similarly to other checks in pm860x_init_battery and initialize soc to silence the warning. Fixes: a830d28b48bf ("power_supply: Enable battery-charger for 88pm860x") Signed-off-by: Tom Rix <[email protected]> Signed-off-by: Sebastian Reichel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Bluetooth: hci_serdev: Only unregister device if it was registered [ Upstream commit 202798db9570104728dce8bb57dfeed47ce764bc ] We should not call hci_unregister_dev if the device was not successfully registered. Fixes: c34dc3bfa7642fd ("Bluetooth: hci_serdev: Introduce hci_uart_unregister_device()") Signed-off-by: Nicolas Boichat <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]> selftests/powerpc: Fix CPU affinity for child process [ Upstream commit 854eb5022be04f81e318765f089f41a57c8e5d83 ] On systems with large number of cpus, test fails trying to set affinity by calling sched_setaffinity() with smaller size for affinity mask. This patch fixes it by making sure that the size of allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs(). Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark") Reported-by: Shirisha Ganta <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Signed-off-by: Harish <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Reviewed-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> PCI: Release IVRS table in AMD ACS quirk [ Upstream commit 090688fa4e448284aaa16136372397d7d10814db ] The acpi_get_table() should be coupled with acpi_put_table() if the mapped table is not used at runtime to release the table mapping. In pci_quirk_amd_sb_acs(), IVRS table is just used for checking AMD IOMMU is supported, not used at runtime, so put the table after using it. Fixes: 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge devices") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hanjun Guo <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]> selftests/powerpc: Fix online CPU selection [ Upstream commit dfa03fff86027e58c8dba5c03ae68150d4e513ad ] The size of the CPU affinity mask must be large enough for systems with a very large number of CPUs. Otherwise, tests which try to determine the first online CPU by calling sched_getaffinity() will fail. This makes sure that the size of the allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs_conf(). Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") Reported-by: Shirisha Ganta <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a408c4b8e9a23bb39b539417a21eb0ff47bb5127.1596084858.git.sandipan@linux.ibm.com Signed-off-by: Sasha Levin <[email protected]> s390/qeth: don't process empty bridge port events [ Upstream commit 02472e28b9a45471c6d8729ff2c7422baa9be46a ] Discard events that don't contain any entries. This shouldn't happen, but subsequent code relies on being able to use entry 0. So better be safe than accessing garbage. Fixes: b4d72c08b358 ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> wl1251: fix always return 0 error [ Upstream commit 20e6421344b5bc2f97b8e2db47b6994368417904 ] wl1251_event_ps_report() should not always return 0 because wl1251_ps_set_mode() may fail. Change it to return 'ret'. Fixes: f7ad1eed4d4b ("wl1251: retry power save entry") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> tools, build: Propagate build failures from tools/build/Makefile.build [ Upstream commit a278f3d8191228212c553a5d4303fa603214b717 ] The '&&' command seems to have a bad effect when $(cmd_$(1)) exits with non-zero effect: the command failure is masked (despite `set -e`) and all but the first command of $(dep-cmd) is executed (successfully, as they are mostly printfs), thus overall returning 0 in the end. This means in practice that despite compilation errors, tools's build Makefile will return success. We see this very reliably with libbpf's Makefile, which doesn't get compilation error propagated properly. This in turns causes issues with selftests build, as well as bpftool and other projects that rely on building libbpf. The fix is simple: don't use &&. Given `set -e`, we don't need to chain commands with &&. The shell will exit on first failure, giving desired behavior and propagating error properly. Fixes: 275e2d95591e ("tools build: Move dependency copy into function") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]> net: ethernet: aquantia: Fix wrong return value [ Upstream commit 0470a48880f8bc42ce26962b79c7b802c5a695ec ] In function hw_atl_a0_hw_multicast_list_set(), when an invalid request is encountered, a negative error code should be returned. Fixes: bab6de8fd180b ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions") Cc: David VomLehn <[email protected]> Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> liquidio: Fix wrong return value in cn23xx_get_pf_num() [ Upstream commit aa027850a292ea65524b8fab83eb91a124ad362c ] On an error exit path, a negative error code should be returned instead of a positive return value. Fixes: 0c45d7fe12c7e ("liquidio: fix use of pf in pass-through mode in a virtual machine") Cc: Rick Farrington <[email protected]> Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> net: spider_net: Fix the size used in a 'dma_free_coherent()' call [ Upstream commit 36f28f7687a9ce665479cce5d64ce7afaa9e77ae ] Update the size used in 'dma_free_coherent()' in order to match the one used in the corresponding 'dma_alloc_coherent()', in 'spider_net_init_chain()'. Fixes: d4ed8f8d1fb7 ("Spidernet DMA coalescing") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: use 32-bit unsigned integer [ Upstream commit 99f47abd9f7bf6e365820d355dc98f6955a562df ] Potentially overflowing expression (ts_freq << 16 and intgr << 16) declared as type u32 (32-bit unsigned) is evaluated using 32-bit arithmetic and then used in a context that expects an expression of type u64 (64-bit unsigned) which ultimately is used as 16-bit unsigned by typecasting to u16. Fixed by using an unsigned 32-bit integer since the value is truncated anyway in the end. Fixes: 414fd46e7762 ("fsl/fman: Add FMan support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix dereference null return value [ Upstream commit 0572054617f32670abab4b4e89a876954d54b704 ] Check before using returned value to avoid dereferencing null pointer. Fixes: 18a6c85fcc78 ("fsl/fman: Add FMan Port Support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix unreachable code [ Upstream commit cc79fd8f557767de90ff199d3b6fb911df43160a ] The parameter 'priority' is incorrectly forced to zero which ultimately induces logically dead code in the subsequent lines. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: check dereferencing null pointer [ Upstream commit cc5d229a122106733a85c279d89d7703f21e4d4f ] Add a safe check to avoid dereferencing null pointer Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix eth hash table allocation [ Upstream commit 3207f715c34317d08e798e11a10ce816feb53c0f ] Fix memory allocation for ethernet address hash table. The code was wrongly allocating an array for eth hash table which is incorrect because this is the main structure for eth hash table (struct eth_hash_t) that contains inside a number of elements. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> dlm: Fix kobject memleak [ Upstream commit 0ffddafc3a3970ef7013696e7f36b3d378bc4c16 ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Set do_unreg = 1 before kobject_init_and_add() to ensure that kobject_put() can be called in its error patch. Fixes: 901195ed7f4b ("Kobject: change GFS2 to use kobject_init_and_add") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: David Teigland <[email protected]> Signed-off-by: Sasha Levin <[email protected]> pinctrl-single: fix pcs_parse_pinconf() return value [ Upstream commit f46fe79ff1b65692a65266a5bec6dbe2bf7fc70f ] This patch causes pcs_parse_pinconf() to return -ENOTSUPP when no pinctrl_map is added. The current behavior is to return 0 when !PCS_HAS_PINCONF or !nconfs. Thus pcs_parse_one_pinctrl_entry() incorrectly assumes that a map was added and sets num_maps = 2. Analysis: ========= The function pcs_parse_one_pinctrl_entry() calls pcs_parse_pinconf() if PCS_HAS_PINCONF is enabled. The function pcs_parse_pinconf() returns 0 to indicate there was no error and num_maps is then set to 2: 980 static int pcs_parse_one_pinctrl_entry(struct pcs_device *pcs, 981 struct device_node *np, 982 struct pinctrl_map **map, 983 unsigned *num_maps, 984 const char **pgnames) 985 { <snip> 1053 (*map)->type = PIN_MAP_TYPE_MUX_GROUP; 1054 (*map)->data.mux.group = np->name; 1055 (*map)->data.mux.function = np->name; 1056 1057 if (PCS_HAS_PINCONF && function) { 1058 res = pcs_parse_pinconf(pcs, np, function, map); 1059 if (res) 1060 goto free_pingroups; 1061 *num_maps = 2; 1062 } else { 1063 *num_maps = 1; 1064 } However, pcs_parse_pinconf() will also return 0 if !PCS_HAS_PINCONF or !nconfs. I believe these conditions should indicate that no map was added by returning -ENOTSUPP. Otherwise pcs_parse_one_pinctrl_entry() will set num_maps = 2 even though no maps were successfully added, as it does not reach "m++" on line 940: 895 static int pcs_parse_pinconf(struct pcs_device *pcs, struct device_node *np, 896 struct pcs_function *func, 897 struct pinctrl_map **map) 898 899 { 900 struct pinctrl_map *m = *map; <snip> 917 /* If pinconf isn't supported, don't parse properties in below. */ 918 if (!PCS_HAS_PINCONF) 919 return 0; 920 921 /* cacluate how much properties are supported in current node */ 922 for (i = 0; i < ARRAY_SIZE(prop2); i++) { 923 if (of_find_property(np, prop2[i].name, NULL)) 924 nconfs++; 925 } 926 for (i = 0; i < ARRAY_SIZE(prop4); i++) { 927 if (of_find_property(np, prop4[i].name, NULL)) 928 nconfs++; 929 } 930 if (!nconfs) 919 return 0; 932 933 func->conf = devm_kcalloc(pcs->dev, 934 nconfs, sizeof(struct pcs_conf_vals), 935 GFP_KERNEL); 936 if (!func->conf) 937 return -ENOMEM; 938 func->nconfs = nconfs; 939 conf = &(func->conf[0]); 940 m++; This situtation will cause a boot failure [0] on the BeagleBone Black (AM3358) when am33xx_pinmux node in arch/arm/boot/dts/am33xx-l4.dtsi has compatible = "pinconf-single" instead of "pinctrl-single". The patch fixes this issue by returning -ENOSUPP when !PCS_HAS_PINCONF or !nconfs, so that pcs_parse_one_pinctrl_entry() will know that no map was added. Logic is also added to pcs_parse_one_pinctrl_entry() to distinguish between -ENOSUPP and other errors. In the case of -ENOSUPP, num_maps is set to 1 as it is valid for pinconf to be enabled and a given pin group to not any pinconf properties. [0] https://lore.kernel.org/linux-omap/20200529175544.GA3766151@x1/ Fixes: 9dddb4df90d1 ("pinctrl: single: support generic pinconf") Signed-off-by: Drew Fustini <[email protected]> Acked-by: Tony Lindgren <[email protected]> Link: https://lore.kernel.org/r/20200608125143.GA2789203@x1 Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]> x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task [ Upstream commit 8ab49526b53d3172d1d8dd03a75c7d1f5bd21239 ] syzbot found its way in 86_fsgsbase_read_task() and triggered this oops: KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 6866 Comm: syz-executor262 Not tainted 5.8.0-syzkaller #0 RIP: 0010:x86_fsgsbase_read_task+0x16d/0x310 arch/x86/kernel/process_64.c:393 Call Trace: putreg32+0x3ab/0x530 arch/x86/kernel/ptrace.c:876 genregs32_set arch/x86/kernel/ptrace.c:1026 [inline] genregs32_set+0xa4/0x100 arch/x86/kernel/ptrace.c:1006 copy_regset_from_user include/linux/regset.h:326 [inline] ia32_arch_ptrace arch/x86/kernel/ptrace.c:1061 [inline] compat_arch_ptrace+0x36c/0xd90 arch/x86/kernel/ptrace.c:1198 __do_compat_sys_ptrace kernel/ptrace.c:1420 [inline] __se_compat_sys_ptrace kernel/ptrace.c:1389 [inline] __ia32_compat_sys_ptrace+0x220/0x2f0 kernel/ptrace.c:1389 do_syscall_32_irqs_on arch/x86/entry/common.c:84 [inline] __do_fast_syscall_32+0x57/0x80 arch/x86/entry/common.c:126 do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:149 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c This can happen if ptrace() or sigreturn() pokes an LDT selector into FS or GS for a task with no LDT and something tries to read the base before a return to usermode notices the bad selector and fixes it. The fix is to make sure ldt pointer is not NULL. Fixes: 07e1d88adaae ("x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately") Co-developed-by: Jann Horn <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Chang S. Bae <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Markus T Metzger <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Shankar <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]> crypto: aesni - add compatibility with IAS [ Upstream commit 44069737ac9625a0f02f0f7f5ab96aae4cd819bc ] Clang's integrated assembler complains "invalid reassignment of non-absolute variable 'var_ddq_add'" while assembling arch/x86/crypto/aes_ctrby8_avx-x86_64.S. It was because var_ddq_add was reassigned with non-absolute values several times, which IAS did not support. We can avoid the reassignment by replacing the uses of var_ddq_add with its definitions accordingly to have compatilibility with IAS. Link: https://github.com/ClangBuiltLinux/linux/issues/1008 Reported-by: Sedat Dilek <[email protected]> Reported-by: Fangrui Song <[email protected]> Tested-by: Sedat Dilek <[email protected]> # build+boot Linux v5.7.5; clang v11.0.0-git Signed-off-by: Jian Cai <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]> af_packet: TPACKET_V3: fix fill status rwlock imbalance [ Upstream commit 88fd1cb80daa20af063bce81e1fad14e945a8dc4 ] After @blk_fill_in_prog_lock is acquired there is an early out vnet situation that can occur. In that case, the rwlock needs to be released. Also, since @blk_fill_in_prog_lock is only acquired when @tp_version is exactly TPACKET_V3, only release it on that exact condition as well. And finally, add sparse annotation so that it is clearer that prb_fill_curr_block() and prb_clear_blk_fill_status() are acquiring and releasing @blk_fill_in_prog_lock, respectively. sparse is still unable to understand the balance, but the warnings are now on a higher level that make more sense. Fixes: 632ca50f2cbd ("af_packet: TPACKET_V3: replace busy-wait loop") Signed-off-by: John Ogness <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drivers/net/wan/lapbether: Added needed_headroom and a skb->len check [ Upstream commit c7ca03c216acb14466a713fedf1b9f2c24994ef2 ] 1. Added a skb->len check This driver expects upper layers to include a pseudo header of 1 byte when passing down a skb for transmission. This driver will read this 1-byte header. This patch added a skb->len check before reading the header to make sure the header exists. 2. Changed to use needed_headroom instead of hard_header_len to request necessary headroom to be allocated In net/packet/af_packet.c, the function packet_snd first reserves a headroom of length (dev->hard_header_len + dev->needed_headroom). Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header, which calls dev->header_ops->create, to create the link layer header. If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of length (dev->hard_header_len), and assumes the user to provide the appropriate link layer header. So according to the logic of af_packet.c, dev->hard_header_len should be the length of the header that would be created by dev->header_ops->create. However, this driver doesn't provide dev->header_ops, so logically dev->hard_header_len should be 0. So we should use dev->needed_headroom instead of dev->hard_header_len to request necessary headroom to be allocated. This change fixes kernel panic when this driver is used with AF_PACKET SOCK_RAW sockets. Call stack when panic: [ 168.399197] skbuff: skb_under_panic: text:ffffffff819d95fb len:20 put:14 head:ffff8882704c0a00 data:ffff8882704c09fd tail:0x11 end:0xc0 dev:veth0 ... [ 168.399255] Call Trace: [ 168.399259] skb_push.cold+0x14/0x24 [ 168.399262] eth_header+0x2b/0xc0 [ 168.399267] lapbeth_data_transmit+0x9a/0xb0 [lapbether] [ 168.399275] lapb_data_transmit+0x22/0x2c [lapb] [ 168.399277] lapb_transmit_buffer+0x71/0xb0 [lapb] [ 168.399279] lapb_kick+0xe3/0x1c0 [lapb] [ 168.399281] lapb_data_request+0x76/0xc0 [lapb] [ 168.399283] lapbeth_xmit+0x56/0x90 [lapbether] [ 168.399286] dev_hard_start_xmit+0x91/0x1f0 [ 168.399289] ? irq_init_percpu_irqstack+0xc0/0x100 [ 168.399291] __dev_queue_xmit+0x721/0x8e0 [ 168.399295] ? packet_parse_headers.isra.0+0xd2/0x110 [ 168.399297] dev_queue_xmit+0x10/0x20 [ 168.399298] packet_sendmsg+0xbf0/0x19b0 ...... Cc: Willem de Bruijn <[email protected]> Cc: Martin Schiller <[email protected]> Cc: Brian Norris <[email protected]> Signed-off-by: Xie He <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net/nfc/rawsock.c: add CAP_NET_RAW check. [ Upstream commit 26896f01467a28651f7a536143fe5ac8449d4041 ] When creating a raw AF_NFC socket, CAP_NET_RAW needs to be checked first. Signed-off-by: Qingyu Li <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: refactor bind_bucket fastreuse into helper [ Upstream commit 62ffc589abb176821662efc4525ee4ac0b9c3894 ] Refactor the fastreuse update code in inet_csk_get_port into a small helper function that can be called from other places. Acked-by: Matthieu Baerts <[email protected]> Signed-off-by: Tim Froidcoeur <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: Set fput_needed iff FDPUT_FPUT is set [ Upstream commit ce787a5a074a86f76f5d3fd804fa78e01bfb9e89 ] We should fput() file iff FDPUT_FPUT is set. So we should set fput_needed accordingly. Fixes: 00e188ef6a7e ("sockfd_lookup_light(): switch to fdget^W^Waway from fget_light") Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: serial: cp210x: re-enable auto-RTS on open commit c7614ff9b73a1e6fb2b1b51396da132ed22fecdb upstream. CP210x hardware disables auto-RTS but leaves auto-CTS when in hardware flow control mode and UART on cp210x hardware is disabled. When re-opening the port, if auto-CTS is enabled on the cp210x, then auto-RTS must be re-enabled in the driver. Signed-off-by: Brant Merryman <[email protected]> Co-developed-by: Phu Luu <[email protected]> Signed-off-by: Phu Luu <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ johan: fix up tags and problem description ] Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control") Cc: stable <[email protected]> # 2.6.12 Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: serial: cp210x: enable usb generic throttle/unthrottle commit 4387b3dbb079d482d3c2b43a703ceed4dd27ed28 upstream. Assign the .throttle and .unthrottle functions to be generic function in the driver structure to prevent d…
[ Upstream commit f678ce8cc3cb2ad29df75d8824c74f36398ba871 ] ddebug_describe_flags() currently fills a caller provided string buffer, after testing its size (also passed) in a BUG_ON. Fix this by replacing them with a known-big-enough string buffer wrapped in a struct, and passing that instead. Also simplify ddebug_describe_flags() flags parameter from a struct to a member in that struct, and hoist the member deref up to the caller. This makes the function reusable (soon) where flags are unpacked. Acked-by: <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> bcache: fix super block seq numbers comparision in register_cache_set() [ Upstream commit 117f636ea695270fe492d0c0c9dfadc7a662af47 ] In register_cache_set(), c is pointer to struct cache_set, and ca is pointer to struct cache, if ca->sb.seq > c->sb.seq, it means this registering cache has up to date version and other members, the in- memory version and other members should be updated to the newer value. But current implementation makes a cache set only has a single cache device, so the above assumption works well except for a special case. The execption is when a cache device new created and both ca->sb.seq and c->sb.seq are 0, because the super block is never flushed out yet. In the location for the following if() check, 2156 if (ca->sb.seq > c->sb.seq) { 2157 c->sb.version = ca->sb.version; 2158 memcpy(c->sb.set_uuid, ca->sb.set_uuid, 16); 2159 c->sb.flags = ca->sb.flags; 2160 c->sb.seq = ca->sb.seq; 2161 pr_debug("set version = %llu\n", c->sb.version); 2162 } c->sb.version is not initialized yet and valued 0. When ca->sb.seq is 0, the if() check will fail (because both values are 0), and the cache set version, set_uuid, flags and seq won't be updated. The above problem is hiden for current code, because the bucket size is compatible among different super block version. And the next time when running cache set again, ca->sb.seq will be larger than 0 and cache set super block version will be updated properly. But if the large bucket feature is enabled, sb->bucket_size is the low 16bits of the bucket size. For a power of 2 value, when the actual bucket size exceeds 16bit width, sb->bucket_size will always be 0. Then read_super_common() will fail because the if() check to is_power_of_2(sb->bucket_size) is false. This is how the long time hidden bug is triggered. This patch modifies the if() check to the following way, 2156 if (ca->sb.seq > c->sb.seq || c->sb.seq == 0) { Then cache set's version, set_uuid, flags and seq will always be updated corectly including for a new created cache device. Signed-off-by: Coly Li <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ACPICA: Do not increment operation_region reference counts for field units [ Upstream commit 6a54ebae6d047c988a31f5ac5a64ab5cf83797a2 ] ACPICA commit e17b28cfcc31918d0db9547b6b274b09c413eb70 Object reference counts are used as a part of ACPICA's garbage collection mechanism. This mechanism keeps track of references to heap-allocated structures such as the ACPI operand objects. Recent server firmware has revealed that this reference count can overflow on large servers that declare many field units under the same operation_region. This occurs because each field unit declaration will add a reference count to the source operation_region. This change solves the reference count overflow for operation_regions objects by preventing fieldunits from incrementing their operation_region's reference count. Each operation_region's reference count will not be changed by named objects declared under the Field operator. During namespace deletion, the operation_region namespace node will be deleted and each fieldunit will be deleted without touching the deleted operation_region object. Link: https://github.com/acpica/acpica/commit/e17b28cf Signed-off-by: Erik Kaneda <[email protected]> Signed-off-by: Bob Moore <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]> agp/intel: Fix a memory leak on module initialisation failure [ Upstream commit b975abbd382fe442713a4c233549abb90e57c22b ] In intel_gtt_setup_scratch_page(), pointer "page" is not released if pci_dma_mapping_error() return an error, leading to a memory leak on module initialisation failure. Simply fix this issue by freeing "page" before return. Fixes: 0e87d2b06cb46 ("intel-gtt: initialize our own scratch page") Signed-off-by: Qiushi Wu <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> video: fbdev: sm712fb: fix an issue about iounmap for a wrong address [ Upstream commit 98bd4f72988646c35569e1e838c0ab80d06c77f6 ] the sfb->fb->screen_base is not save the value get by iounmap() when the chip id is 0x720. so iounmap() for address sfb->fb->screen_base is not right. Fixes: 1461d6672864854 ("staging: sm7xxfb: merge sm712fb with fbdev") Cc: Andy Shevchenko <[email protected]> Cc: Sudip Mukherjee <[email protected]> Cc: Teddy Wang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Dejin Zheng <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> console: newport_con: fix an issue about leak related system resources [ Upstream commit fd4b8243877250c05bb24af7fea5567110c9720b ] A call of the function do_take_over_console() can fail here. The corresponding system resources were not released then. Thus add a call of iounmap() and release_mem_region() together with the check of a failure predicate. and also add release_mem_region() on device removal. Fixes: e86bb8acc0fdc ("[PATCH] VT binding: Make newport_con support binding") Suggested-by: Bartlomiej Zolnierkiewicz <[email protected]> Signed-off-by: Dejin Zheng <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> cc: Thomas Gleixner <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> video: pxafb: Fix the function used to balance a 'dma_alloc_coherent()' call [ Upstream commit 499a2c41b954518c372873202d5e7714e22010c4 ] 'dma_alloc_coherent()' must be balanced by a call to 'dma_free_coherent()' not 'dma_free_wc()'. The correct dma_free_ function is already used in the error handling path of the probe function. Fixes: 77e196752bdd ("[ARM] pxafb: allow video memory size to be configurable") Signed-off-by: Christophe JAILLET <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Jani Nikula <[email protected]> cc: Mauro Carvalho Chehab <[email protected]> Cc: Eric Miao <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> iio: improve IIO_CONCENTRATION channel type description [ Upstream commit df16c33a4028159d1ba8a7061c9fa950b58d1a61 ] IIO_CONCENTRATION together with INFO_RAW specifier is used for reporting raw concentrations of pollutants. Raw value should be meaningless before being properly scaled. Because of that description shouldn't mention raw value unit whatsoever. Fix this by rephrasing existing description so it follows conventions used throughout IIO ABI docs. Fixes: 8ff6b3bc94930 ("iio: chemical: Add IIO_CONCENTRATION channel type") Signed-off-by: Tomasz Duszynski <[email protected]> Acked-by: Matt Ranostay <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/arm: fix unintentional integer overflow on left shift [ Upstream commit 5f368ddea6fec519bdb93b5368f6a844b6ea27a6 ] Shifting the integer value 1 is evaluated using 32-bit arithmetic and then used in an expression that expects a long value leads to a potential integer overflow. Fix this by using the BIT macro to perform the shift to avoid the overflow. Addresses-Coverity: ("Unintentional integer overflow") Fixes: ad49f8602fe8 ("drm/arm: Add support for Mali Display Processors") Signed-off-by: Colin Ian King <[email protected]> Acked-by: Liviu Dudau <[email protected]> Signed-off-by: Liviu Dudau <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> leds: lm355x: avoid enum conversion warning [ Upstream commit 985b1f596f9ed56f42b8c2280005f943e1434c06 ] clang points out that doing arithmetic between diffent enums is usually a mistake: drivers/leds/leds-lm355x.c:167:28: warning: bitwise operation between different enumeration types ('enum lm355x_tx2' and 'enum lm355x_ntc') [-Wenum-enum-conversion] reg_val = pdata->pin_tx2 | pdata->ntc_pin; ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ drivers/leds/leds-lm355x.c:178:28: warning: bitwise operation between different enumeration types ('enum lm355x_tx2' and 'enum lm355x_ntc') [-Wenum-enum-conversion] reg_val = pdata->pin_tx2 | pdata->ntc_pin | pdata->pass_mode; ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ In this driver, it is intentional, so add a cast to hide the false-positive warning. It appears to be the only instance of this warning at the moment. Fixes: b98d13c72592 ("leds: Add new LED driver for lm355x chips") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Pavel Machek <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: omap3isp: Add missed v4l2_ctrl_handler_free() for preview_init_entities() [ Upstream commit dc7690a73017e1236202022e26a6aa133f239c8c ] preview_init_entities() does not call v4l2_ctrl_handler_free() when it fails. Add the missed function to fix it. Fixes: de1135d44f4f ("[media] omap3isp: CCDC, preview engine and resizer") Signed-off-by: Chuhong Yuan <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Signed-off-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ASoC: Intel: bxt_rt298: add missing .owner field [ Upstream commit 88cee34b776f80d2da04afb990c2a28c36799c43 ] This field is required for ASoC cards. Not setting it will result in a module->name pointer being NULL and generate problems such as cat /proc/asound/modules 0 (efault) Fixes: 76016322ec56 ('ASoC: Intel: Add Broxton-P machine driver') Reported-by: Jaroslav Kysela <[email protected]> Suggested-by: Takashi Iwai <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: cumana_2: Fix different dev_id between request_irq() and free_irq() [ Upstream commit 040ab9c4fd0070cd5fa71ba3a7b95b8470db9b4d ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Acked-by: Russell King <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/mipi: use dcs write for mipi_dsi_dcs_set_tear_scanline [ Upstream commit 7a05c3b6d24b8460b3cec436cf1d33fac43c8450 ] The helper uses the MIPI_DCS_SET_TEAR_SCANLINE, although it's currently using the generic write. This does not look right. Perhaps some platforms don't distinguish between the two writers? Cc: Robert Chiras <[email protected]> Cc: Vinay Simha BN <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Thierry Reding <[email protected]> Fixes: e83950816367 ("drm/dsi: Implement set tear scanline") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Thierry Reding <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> cxl: Fix kobject memleak [ Upstream commit 85c5cbeba8f4fb28e6b9bfb3e467718385f78f76 ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Fixes: b087e6190ddc ("cxl: Export optional AFU configuration record in sysfs") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Acked-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/radeon: fix array out-of-bounds read and write issues [ Upstream commit 7ee78aff9de13d5dccba133f4a0de5367194b243 ] There is an off-by-one bounds check on the index into arrays table->mc_reg_address and table->mc_reg_table_entry[k].mc_data[j] that can lead to reads and writes outside of arrays. Fix the bound checking off-by-one error. Addresses-Coverity: ("Out-of-bounds read/write") Fixes: cc8dbbb4f62a ("drm/radeon: add dpm support for CI dGPUs (v2)") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: powertec: Fix different dev_id between request_irq() and free_irq() [ Upstream commit d179f7c763241c1dc5077fca88ddc3c47d21b763 ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: eesox: Fix different dev_id between request_irq() and free_irq() [ Upstream commit 86f2da1112ccf744ad9068b1d5d9843faf8ddee6 ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ipvs: allow connection reuse for unconfirmed conntrack [ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 https://github.com/kubernetes/kubernetes/issues/70747 - Apache Bench can fill up ipvs service proxy in seconds #544 https://github.com/cloudnativelabs/kube-router/issues/544 - Additional 1s latency in `host -> service IP -> pod` https://github.com/kubernetes/kubernetes/issues/90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: firewire: Using uninitialized values in node_probe() [ Upstream commit 2505a210fc126599013aec2be741df20aaacc490 ] If fw_csr_string() returns -ENOENT, then "name" is uninitialized. So then the "strlen(model_names[i]) <= name_len" is true because strlen() is unsigned and -ENOENT is type promoted to a very high positive value. Then the "strncmp(name, model_names[i], name_len)" uses uninitialized data because "name" is uninitialized. Fixes: 92374e886c75 ("[media] firedtv: drop obsolete backend abstraction") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: exynos4-is: Add missed check for pinctrl_lookup_state() [ Upstream commit 18ffec750578f7447c288647d7282c7d12b1d969 ] fimc_md_get_pinctrl() misses a check for pinctrl_lookup_state(). Add the missed check to fix it. Fixes: 4163851f7b99 ("[media] s5p-fimc: Use pinctrl API for camera ports configuration]") Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> xfs: fix reflink quota reservation accounting error [ Upstream commit 83895227aba1ade33e81f586aa7b6b1e143096a5 ] Quota reservations are supposed to account for the blocks that might be allocated due to a bmap btree split. Reflink doesn't do this, so fix this to make the quota accounting more accurate before we start rearranging things. Fixes: 862bb360ef56 ("xfs: reflink extents from one file to another") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Sasha Levin <[email protected]> PCI: Fix pci_cfg_wait queue locking problem [ Upstream commit 2a7e32d0547f41c5ce244f84cf5d6ca7fccee7eb ] The pci_cfg_wait queue is used to prevent user-space config accesses to devices while they are recovering from reset. Previously we used these operations on pci_cfg_wait: __add_wait_queue(&pci_cfg_wait, ...) __remove_wait_queue(&pci_cfg_wait, ...) wake_up_all(&pci_cfg_wait) The wake_up acquires the wait queue lock, but the add and remove do not. Originally these were all protected by the pci_lock, but cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock"), moved wake_up_all() outside pci_lock, so it could race with add/remove operations, which caused occasional kernel panics, e.g., during vfio-pci hotplug/unplug testing: Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000 Resolve this by using wait_event() instead of __add_wait_queue() and __remove_wait_queue(). The wait queue lock is held by both wait_event() and wake_up_all(), so it provides mutual exclusion. Fixes: cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock") Link: https://lore.kernel.org/linux-pci/[email protected]/T/#u Based-on: https://lore.kernel.org/linux-pci/[email protected]/ Based-on-patch-by: Xiang Zheng <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Tested-by: Xiang Zheng <[email protected]> Cc: Heyi Guo <[email protected]> Cc: Biaoxiang Ye <[email protected]> Signed-off-by: Sasha Levin <[email protected]> leds: core: Flush scheduled work for system suspend [ Upstream commit 302a085c20194bfa7df52e0fe684ee0c41da02e6 ] Sometimes LED won't be turned off by LED_CORE_SUSPENDRESUME flag upon system suspend. led_set_brightness_nopm() uses schedule_work() to set LED brightness. However, there's no guarantee that the scheduled work gets executed because no one flushes the work. So flush the scheduled work to make sure LED gets turned off. Signed-off-by: Kai-Heng Feng <[email protected]> Acked-by: Jacek Anaszewski <[email protected]> Fixes: 81fe8e5b73e3 ("leds: core: Add led_set_brightness_nosleep{nopm} functions") Signed-off-by: Pavel Machek <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm: panel: simple: Fix bpc for LG LB070WV8 panel [ Upstream commit a6ae2fe5c9f9fd355a48fb7d21c863e5b20d6c9c ] The LG LB070WV8 panel incorrectly reports a 16 bits per component value, while the panel uses 8 bits per component. Fix it. Fixes: dd0150026901 ("drm/panel: simple: Add support for LG LB070WV8 800x480 7" panel") Signed-off-by: Laurent Pinchart <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> drm/bridge: sil_sii8620: initialize return of sii8620_readb [ Upstream commit 02cd2d3144653e6e2a0c7ccaa73311e48e2dc686 ] clang static analysis flags this error sil-sii8620.c:184:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return ret; ^~~~~~~~~~ sii8620_readb calls sii8620_read_buf. sii8620_read_buf can return without setting its output pararmeter 'ret'. So initialize ret. Fixes: ce6e153f414a ("drm/bridge: add Silicon Image SiI8620 driver") Signed-off-by: Tom Rix <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> scsi: scsi_debug: Add check for sdebug_max_queue during module init [ Upstream commit c87bf24cfb60bce27b4d2c7e56ebfd86fb9d16bb ] sdebug_max_queue should not exceed SDEBUG_CANQUEUE, otherwise crashes like this can be triggered by passing an out-of-range value: Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019 pstate: 20400009 (nzCv daif +PAN -UAO BTYPE=--) pc : schedule_resp+0x2a4/0xa70 [scsi_debug] lr : schedule_resp+0x52c/0xa70 [scsi_debug] sp : ffff800022ab36f0 x29: ffff800022ab36f0 x28: ffff0023a935a610 x27: ffff800008e0a648 x26: 0000000000000003 x25: ffff0023e84f3200 x24: 00000000003d0900 x23: 0000000000000000 x22: 0000000000000000 x21: ffff0023be60a320 x20: ffff0023be60b538 x19: ffff800008e13000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000001 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 00000000000000c1 x5 : 0000020000200000 x4 : dead0000000000ff x3 : 0000000000000200 x2 : 0000000000000200 x1 : ffff800008e13d88 x0 : 0000000000000000 Call trace: schedule_resp+0x2a4/0xa70 [scsi_debug] scsi_debug_queuecommand+0x2c4/0x9e0 [scsi_debug] scsi_queue_rq+0x698/0x840 __blk_mq_try_issue_directly+0x108/0x228 blk_mq_request_issue_directly+0x58/0x98 blk_mq_try_issue_list_directly+0x5c/0xf0 blk_mq_sched_insert_requests+0x18c/0x200 blk_mq_flush_plug_list+0x11c/0x190 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x38/0x210 blkdev_direct_IO+0x450/0x4d8 generic_file_read_iter+0x84/0x180 blkdev_read_iter+0x3c/0x50 aio_read+0xc0/0x170 io_submit_one+0x5c8/0xc98 __arm64_sys_io_submit+0x1b0/0x258 el0_svc_common.constprop.3+0x68/0x170 do_el0_svc+0x24/0x90 el0_sync_handler+0x13c/0x1a8 el0_sync+0x158/0x180 Code: 528847e0 72a001e0 6b00003f 540018cd (3941c340) In addition, it should not be less than 1. So add checks for these, and fail the module init for those cases. [mkp: changed if condition to match error message] Link: https://lore.kernel.org/r/[email protected] Fixes: c483739430f1 ("scsi_debug: add multiple queue support") Reviewed-by: Ming Lei <[email protected]> Acked-by: Douglas Gilbert <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> mwifiex: Prevent memory corruption handling keys [ Upstream commit e18696786548244914f36ec3c46ac99c53df99c3 ] The length of the key comes from the network and it's a 16 bit number. It needs to be capped to prevent a buffer overflow. Fixes: 5e6e3a92b9a4 ("wireless: mwifiex: initial commit for Marvell mwifiex driver") Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Ganapathi Bhat <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/20200708115857.GA13729@mwanda Signed-off-by: Sasha Levin <[email protected]> powerpc/vdso: Fix vdso cpu truncation [ Upstream commit a9f675f950a07d5c1dbcbb97aabac56f5ed085e3 ] The code in vdso_cpu_init that exposes the cpu and numa node to userspace via SPRG_VDSO incorrctly masks the cpu to 12 bits. This means that any kernel running on a box with more than 4096 threads (NR_CPUS advertises a limit of of 8192 cpus) would expose userspace to two cpu contexts running at the same time with the same cpu number. Note: I'm not aware of any distro shipping a kernel with support for more than 4096 threads today, nor of any system image that currently exceeds 4096 threads. Found via code browsing. Fixes: 18ad51dd342a7eb09dbcd059d0b451b616d4dafc ("powerpc: Add VDSO version of getcpu") Signed-off-by: Milton Miller <[email protected]> Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> staging: rtl8192u: fix a dubious looking mask before a shift [ Upstream commit c4283950a9a4d3bf4a3f362e406c80ab14f10714 ] Currently the masking of ret with 0xff and followed by a right shift of 8 bits always leaves a zero result. It appears the mask of 0xff is incorrect and should be 0xff00, but I don't have the hardware to test this. Fix this to mask the upper 8 bits before shifting. [ Not tested ] Addresses-Coverity: ("Operands don't affect result") Fixes: 8fc8598e61f6 ("Staging: Added Realtek rtl8192u driver to staging") Signed-off-by: Colin Ian King <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> PCI/ASPM: Add missing newline in sysfs 'policy' [ Upstream commit 3167e3d340c092fd47924bc4d23117a3074ef9a9 ] When I cat ASPM parameter 'policy' by sysfs, it displays as follows. Add a newline for easy reading. Other sysfs attributes already include a newline. [root@localhost ~]# cat /sys/module/pcie_aspm/parameters/policy [default] performance powersave powersupersave [root@localhost ~]# Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiongfeng Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/imx: tve: fix regulator_disable error path [ Upstream commit 7bb58b987fee26da2a1665c01033022624986b7c ] Add missing regulator_disable() as devm_action to avoid dedicated unbind() callback and fix the missing error handling. Fixes: fcbc51e54d2a ("staging: drm/imx: Add support for Television Encoder (TVEv2)") Signed-off-by: Marco Felsch <[email protected]> Signed-off-by: Philipp Zabel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> USB: serial: iuu_phoenix: fix led-activity helpers [ Upstream commit de37458f8c2bfc465500a1dd0d15dbe96d2a698c ] The set-led command is eight bytes long and starts with a command byte followed by six bytes of RGB data and ends with a byte encoding a frequency (see iuu_led() and iuu_rgbf_fill_buffer()). The led activity helpers had a few long-standing bugs which corrupted the command packets by inserting a second command byte and thereby offsetting the RGB data and dropping the frequency in non-xmas mode. In xmas mode, a related off-by-one error left the frequency field uninitialised. Fixes: 60a8fc017103 ("USB: add iuu_phoenix driver") Reported-by: George Spelvin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Sasha Levin <[email protected]> thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor() [ Upstream commit 0f348db01fdf128813fdd659fcc339038fb421a4 ] This condition is reversed and will cause breakage. Fixes: 7440f518dad9 ("thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTR") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/20200616091949.GA11940@mwanda Signed-off-by: Sasha Levin <[email protected]> coresight: tmc: Fix TMC mode read in tmc_read_unprepare_etb() [ Upstream commit d021f5c5ff679432c5e9faee0fd7350db2efb97c ] Reading TMC mode register without proper coresight power management can lead to exceptions like the one in the call trace below in tmc_read_unprepare_etb() when the trace data is read after the sink is disabled. So fix this by having a check for coresight sysfs mode before reading TMC mode management register in tmc_read_unprepare_etb() similar to tmc_read_prepare_etb(). SError Interrupt on CPU6, code 0xbe000411 -- SError pstate: 80400089 (Nzcv daIf +PAN -UAO) pc : tmc_read_unprepare_etb+0x74/0x108 lr : tmc_read_unprepare_etb+0x54/0x108 sp : ffffff80d9507c30 x29: ffffff80d9507c30 x28: ffffff80b3569a0c x27: 0000000000000000 x26: 00000000000a0001 x25: ffffff80cbae9550 x24: 0000000000000010 x23: ffffffd07296b0f0 x22: ffffffd0109ee028 x21: 0000000000000000 x20: ffffff80d19e70e0 x19: ffffff80d19e7080 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: dfffffd000000001 x9 : 0000000000000000 x8 : 0000000000000002 x7 : ffffffd071d0fe78 x6 : 0000000000000000 x5 : 0000000000000080 x4 : 0000000000000001 x3 : ffffffd071d0fe98 x2 : 0000000000000000 x1 : 0000000000000004 x0 : 0000000000000001 Kernel panic - not syncing: Asynchronous SError Interrupt Fixes: 4525412a5046 ("coresight: tmc: making prepare/unprepare functions generic") Reported-by: Mike Leach <[email protected]> Signed-off-by: Sai Prakash Ranjan <[email protected]> Tested-by: Mike Leach <[email protected]> Signed-off-by: Mathieu Poirier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> MIPS: OCTEON: add missing put_device() call in dwc3_octeon_device_init() [ Upstream commit e8b9fc10f2615b9a525fce56981e40b489528355 ] if of_find_device_by_node() succeed, dwc3_octeon_device_init() doesn't have a corresponding put_device(). Thus add put_device() to fix the exception handling for this function implementation. Fixes: 93e502b3c2d4 ("MIPS: OCTEON: Platform support for OCTEON III USB controller") Signed-off-by: Yu Kuai <[email protected]> Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Sasha Levin <[email protected]> usb: dwc2: Fix error path in gadget registration [ Upstream commit 33a06f1300a79cfd461cea0268f05e969d4f34ec ] When gadget registration fails, one should not call usb_del_gadget_udc(). Ensure this by setting gadget->udc to NULL. Also in case of a failure there is no need to disable low-level hardware, so return immiedetly instead of jumping to error_init label. This fixes the following kernel NULL ptr dereference on gadget failure (can be easily triggered with g_mass_storage without any module parameters): dwc2 12480000.hsotg: dwc2_check_params: Invalid parameter besl=1 dwc2 12480000.hsotg: dwc2_check_params: Invalid parameter g_np_tx_fifo_size=1024 dwc2 12480000.hsotg: EPs: 16, dedicated fifos, 7808 entries in SPRAM Mass Storage Function, version: 2009/09/11 LUN: removable file: (no medium) no file given for LUN0 g_mass_storage 12480000.hsotg: failed to start g_mass_storage: -22 8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000104 pgd = (ptrval) [00000104] *pgd=00000000 Internal error: Oops: 805 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.8.0-rc5 #3133 Hardware name: Samsung Exynos (Flattened Device Tree) Workqueue: events deferred_probe_work_func PC is at usb_del_gadget_udc+0x38/0xc4 LR is at __mutex_lock+0x31c/0xb18 ... Process kworker/0:1 (pid: 12, stack limit = 0x(ptrval)) Stack: (0xef121db0 to 0xef122000) ... [<c076bf3c>] (usb_del_gadget_udc) from [<c0726bec>] (dwc2_hsotg_remove+0x10/0x20) [<c0726bec>] (dwc2_hsotg_remove) from [<c0711208>] (dwc2_driver_probe+0x57c/0x69c) [<c0711208>] (dwc2_driver_probe) from [<c06247c0>] (platform_drv_probe+0x6c/0xa4) [<c06247c0>] (platform_drv_probe) from [<c0621df4>] (really_probe+0x200/0x48c) [<c0621df4>] (really_probe) from [<c06221e8>] (driver_probe_device+0x78/0x1fc) [<c06221e8>] (driver_probe_device) from [<c061fcd4>] (bus_for_each_drv+0x74/0xb8) [<c061fcd4>] (bus_for_each_drv) from [<c0621b54>] (__device_attach+0xd4/0x16c) [<c0621b54>] (__device_attach) from [<c0620c98>] (bus_probe_device+0x88/0x90) [<c0620c98>] (bus_probe_device) from [<c06211b0>] (deferred_probe_work_func+0x3c/0xd0) [<c06211b0>] (deferred_probe_work_func) from [<c0149280>] (process_one_work+0x234/0x7dc) [<c0149280>] (process_one_work) from [<c014986c>] (worker_thread+0x44/0x51c) [<c014986c>] (worker_thread) from [<c0150b1c>] (kthread+0x158/0x1a0) [<c0150b1c>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20) Exception stack(0xef121fb0 to 0xef121ff8) ... ---[ end trace 9724c2fc7cc9c982 ]--- While fixing this also fix the double call to dwc2_lowlevel_hw_disable() if dr_mode is set to USB_DR_MODE_PERIPHERAL. In such case low-level hardware is already disabled before calling usb_add_gadget_udc(). That function correctly preserves low-level hardware state, there is no need for the second unconditional dwc2_lowlevel_hw_disable() call. Fixes: 207324a321a8 ("usb: dwc2: Postponed gadget registration to the udc class driver") Acked-by: Minas Harutyunyan <[email protected]> Signed-off-by: Marek Szyprowski <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: mesh: Fix panic after host or bus reset [ Upstream commit edd7dd2292ab9c3628b65c4d04514c3068ad54f6 ] Booting Linux with a Conner CP3200 drive attached to the MESH SCSI bus results in EH measures and a panic: [ 25.499838] mesh: configured for synchronous 5 MB/s [ 25.787154] mesh: performing initial bus reset... [ 29.867115] scsi host0: MESH [ 29.929527] mesh: target 0 synchronous at 3.6 MB/s [ 29.998763] scsi 0:0:0:0: Direct-Access CONNER CP3200-200mb-3.5 4040 PQ: 0 ANSI: 1 CCS [ 31.989975] sd 0:0:0:0: [sda] 415872 512-byte logical blocks: (213 MB/203 MiB) [ 32.070975] sd 0:0:0:0: [sda] Write Protect is off [ 32.137197] sd 0:0:0:0: [sda] Mode Sense: 5b 00 00 08 [ 32.209661] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 32.332708] sda: [mac] sda1 sda2 sda3 [ 32.417733] sd 0:0:0:0: [sda] Attached SCSI disk ... snip ... [ 76.687067] mesh_abort((ptrval)) [ 76.743606] mesh: state at (ptrval), regs at (ptrval), dma at (ptrval) [ 76.810798] ct=6000 seq=86 bs=4017 fc= 0 exc= 0 err= 0 im= 7 int= 0 sp=85 [ 76.880720] dma stat=84e0 cmdptr=1f73d000 [ 76.941387] phase=4 msgphase=0 conn_tgt=0 data_ptr=24576 [ 77.005567] dma_st=1 dma_ct=0 n_msgout=0 [ 77.065456] target 0: req=(ptrval) goes_out=0 saved_ptr=0 [ 77.130512] mesh_abort((ptrval)) [ 77.187670] mesh: state at (ptrval), regs at (ptrval), dma at (ptrval) [ 77.255594] ct=6000 seq=86 bs=4017 fc= 0 exc= 0 err= 0 im= 7 int= 0 sp=85 [ 77.325778] dma stat=84e0 cmdptr=1f73d000 [ 77.387239] phase=4 msgphase=0 conn_tgt=0 data_ptr=24576 [ 77.453665] dma_st=1 dma_ct=0 n_msgout=0 [ 77.515900] target 0: req=(ptrval) goes_out=0 saved_ptr=0 [ 77.582902] mesh_host_reset [ 88.187083] Kernel panic - not syncing: mesh: double DMA start ! [ 88.254510] CPU: 0 PID: 358 Comm: scsi_eh_0 Not tainted 5.6.13-pmac #1 [ 88.323302] Call Trace: [ 88.378854] [e16ddc58] [c0027080] panic+0x13c/0x308 (unreliable) [ 88.446221] [e16ddcb8] [c02b2478] mesh_start.part.12+0x130/0x414 [ 88.513298] [e16ddcf8] [c02b2fc8] mesh_queue+0x54/0x70 [ 88.577097] [e16ddd18] [c02a1848] scsi_send_eh_cmnd+0x374/0x384 [ 88.643476] [e16dddc8] [c02a1938] scsi_eh_tur+0x5c/0xb8 [ 88.707878] [e16dddf8] [c02a1ab8] scsi_eh_test_devices+0x124/0x178 [ 88.775663] [e16dde28] [c02a2094] scsi_eh_ready_devs+0x588/0x8a8 [ 88.843124] [e16dde98] [c02a31d8] scsi_error_handler+0x344/0x520 [ 88.910697] [e16ddf08] [c00409c8] kthread+0xe4/0xe8 [ 88.975166] [e16ddf38] [c000f234] ret_from_kernel_thread+0x14/0x1c [ 89.044112] Rebooting in 180 seconds.. In theory, a panic can happen after a bus or host reset with dma_started flag set. Fix this by halting the DMA before reinitializing the host. Don't assume that ms->current_req is set when halt_dma() is invoked as it may not hold for bus or host reset. BTW, this particular Conner drive can be made to work by inhibiting disconnect/reselect with 'mesh.resel_targets=0'. Link: https://lore.kernel.org/r/3952bc691e150a7128b29120999b6092071b039a.1595460351.git.fthain@telegraphics.com.au Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: Paul Mackerras <[email protected]> Reported-and-tested-by: Stan Johnson <[email protected]> Signed-off-by: Finn Thain <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration [ Upstream commit 0f3c66a3c7b4e8b9f654b3c998e9674376a51b0f ] The MV88E6097 chip does not support configuring jumbo frames. Prior to commit 5f4366660d65 only the 6352, 6351, 6165 and 6320 chips configured jumbo mode. The refactor accidentally added the function for the 6097. Remove the erroneous function pointer assignment. Fixes: 5f4366660d65 ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames") Signed-off-by: Chris Packham <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Smack: fix another vsscanf out of bounds [ Upstream commit a6bd4f6d9b07452b0b19842044a6c3ea384b0b88 ] This is similar to commit 84e99e58e8d1 ("Smack: slab-out-of-bounds in vsscanf") where we added a bounds check on "rule". Reported-by: [email protected] Fixes: f7112e6c9abf ("Smack: allow for significantly longer Smack labels v4") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Smack: prevent underflow in smk_set_cipso() [ Upstream commit 42a2df3e829f3c5562090391b33714b2e2e5ad4a ] We have an upper bound on "maplevel" but forgot to check for negative values. Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]> power: supply: check if calc_soc succeeded in pm860x_init_battery [ Upstream commit ccf193dee1f0fff55b556928591f7818bac1b3b1 ] clang static analysis flags this error 88pm860x_battery.c:522:19: warning: Assigned value is garbage or undefined [core.uninitialized.Assign] info->start_soc = soc; ^ ~~~ soc is set by calling calc_soc. But calc_soc can return without setting soc. So check the return status and bail similarly to other checks in pm860x_init_battery and initialize soc to silence the warning. Fixes: a830d28b48bf ("power_supply: Enable battery-charger for 88pm860x") Signed-off-by: Tom Rix <[email protected]> Signed-off-by: Sebastian Reichel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Bluetooth: hci_serdev: Only unregister device if it was registered [ Upstream commit 202798db9570104728dce8bb57dfeed47ce764bc ] We should not call hci_unregister_dev if the device was not successfully registered. Fixes: c34dc3bfa7642fd ("Bluetooth: hci_serdev: Introduce hci_uart_unregister_device()") Signed-off-by: Nicolas Boichat <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]> selftests/powerpc: Fix CPU affinity for child process [ Upstream commit 854eb5022be04f81e318765f089f41a57c8e5d83 ] On systems with large number of cpus, test fails trying to set affinity by calling sched_setaffinity() with smaller size for affinity mask. This patch fixes it by making sure that the size of allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs(). Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark") Reported-by: Shirisha Ganta <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Signed-off-by: Harish <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Reviewed-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> PCI: Release IVRS table in AMD ACS quirk [ Upstream commit 090688fa4e448284aaa16136372397d7d10814db ] The acpi_get_table() should be coupled with acpi_put_table() if the mapped table is not used at runtime to release the table mapping. In pci_quirk_amd_sb_acs(), IVRS table is just used for checking AMD IOMMU is supported, not used at runtime, so put the table after using it. Fixes: 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge devices") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hanjun Guo <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]> selftests/powerpc: Fix online CPU selection [ Upstream commit dfa03fff86027e58c8dba5c03ae68150d4e513ad ] The size of the CPU affinity mask must be large enough for systems with a very large number of CPUs. Otherwise, tests which try to determine the first online CPU by calling sched_getaffinity() will fail. This makes sure that the size of the allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs_conf(). Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") Reported-by: Shirisha Ganta <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a408c4b8e9a23bb39b539417a21eb0ff47bb5127.1596084858.git.sandipan@linux.ibm.com Signed-off-by: Sasha Levin <[email protected]> s390/qeth: don't process empty bridge port events [ Upstream commit 02472e28b9a45471c6d8729ff2c7422baa9be46a ] Discard events that don't contain any entries. This shouldn't happen, but subsequent code relies on being able to use entry 0. So better be safe than accessing garbage. Fixes: b4d72c08b358 ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> wl1251: fix always return 0 error [ Upstream commit 20e6421344b5bc2f97b8e2db47b6994368417904 ] wl1251_event_ps_report() should not always return 0 because wl1251_ps_set_mode() may fail. Change it to return 'ret'. Fixes: f7ad1eed4d4b ("wl1251: retry power save entry") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> tools, build: Propagate build failures from tools/build/Makefile.build [ Upstream commit a278f3d8191228212c553a5d4303fa603214b717 ] The '&&' command seems to have a bad effect when $(cmd_$(1)) exits with non-zero effect: the command failure is masked (despite `set -e`) and all but the first command of $(dep-cmd) is executed (successfully, as they are mostly printfs), thus overall returning 0 in the end. This means in practice that despite compilation errors, tools's build Makefile will return success. We see this very reliably with libbpf's Makefile, which doesn't get compilation error propagated properly. This in turns causes issues with selftests build, as well as bpftool and other projects that rely on building libbpf. The fix is simple: don't use &&. Given `set -e`, we don't need to chain commands with &&. The shell will exit on first failure, giving desired behavior and propagating error properly. Fixes: 275e2d95591e ("tools build: Move dependency copy into function") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]> net: ethernet: aquantia: Fix wrong return value [ Upstream commit 0470a48880f8bc42ce26962b79c7b802c5a695ec ] In function hw_atl_a0_hw_multicast_list_set(), when an invalid request is encountered, a negative error code should be returned. Fixes: bab6de8fd180b ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions") Cc: David VomLehn <[email protected]> Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> liquidio: Fix wrong return value in cn23xx_get_pf_num() [ Upstream commit aa027850a292ea65524b8fab83eb91a124ad362c ] On an error exit path, a negative error code should be returned instead of a positive return value. Fixes: 0c45d7fe12c7e ("liquidio: fix use of pf in pass-through mode in a virtual machine") Cc: Rick Farrington <[email protected]> Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> net: spider_net: Fix the size used in a 'dma_free_coherent()' call [ Upstream commit 36f28f7687a9ce665479cce5d64ce7afaa9e77ae ] Update the size used in 'dma_free_coherent()' in order to match the one used in the corresponding 'dma_alloc_coherent()', in 'spider_net_init_chain()'. Fixes: d4ed8f8d1fb7 ("Spidernet DMA coalescing") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: use 32-bit unsigned integer [ Upstream commit 99f47abd9f7bf6e365820d355dc98f6955a562df ] Potentially overflowing expression (ts_freq << 16 and intgr << 16) declared as type u32 (32-bit unsigned) is evaluated using 32-bit arithmetic and then used in a context that expects an expression of type u64 (64-bit unsigned) which ultimately is used as 16-bit unsigned by typecasting to u16. Fixed by using an unsigned 32-bit integer since the value is truncated anyway in the end. Fixes: 414fd46e7762 ("fsl/fman: Add FMan support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix dereference null return value [ Upstream commit 0572054617f32670abab4b4e89a876954d54b704 ] Check before using returned value to avoid dereferencing null pointer. Fixes: 18a6c85fcc78 ("fsl/fman: Add FMan Port Support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix unreachable code [ Upstream commit cc79fd8f557767de90ff199d3b6fb911df43160a ] The parameter 'priority' is incorrectly forced to zero which ultimately induces logically dead code in the subsequent lines. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: check dereferencing null pointer [ Upstream commit cc5d229a122106733a85c279d89d7703f21e4d4f ] Add a safe check to avoid dereferencing null pointer Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix eth hash table allocation [ Upstream commit 3207f715c34317d08e798e11a10ce816feb53c0f ] Fix memory allocation for ethernet address hash table. The code was wrongly allocating an array for eth hash table which is incorrect because this is the main structure for eth hash table (struct eth_hash_t) that contains inside a number of elements. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> dlm: Fix kobject memleak [ Upstream commit 0ffddafc3a3970ef7013696e7f36b3d378bc4c16 ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Set do_unreg = 1 before kobject_init_and_add() to ensure that kobject_put() can be called in its error patch. Fixes: 901195ed7f4b ("Kobject: change GFS2 to use kobject_init_and_add") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: David Teigland <[email protected]> Signed-off-by: Sasha Levin <[email protected]> pinctrl-single: fix pcs_parse_pinconf() return value [ Upstream commit f46fe79ff1b65692a65266a5bec6dbe2bf7fc70f ] This patch causes pcs_parse_pinconf() to return -ENOTSUPP when no pinctrl_map is added. The current behavior is to return 0 when !PCS_HAS_PINCONF or !nconfs. Thus pcs_parse_one_pinctrl_entry() incorrectly assumes that a map was added and sets num_maps = 2. Analysis: ========= The function pcs_parse_one_pinctrl_entry() calls pcs_parse_pinconf() if PCS_HAS_PINCONF is enabled. The function pcs_parse_pinconf() returns 0 to indicate there was no error and num_maps is then set to 2: 980 static int pcs_parse_one_pinctrl_entry(struct pcs_device *pcs, 981 struct device_node *np, 982 struct pinctrl_map **map, 983 unsigned *num_maps, 984 const char **pgnames) 985 { <snip> 1053 (*map)->type = PIN_MAP_TYPE_MUX_GROUP; 1054 (*map)->data.mux.group = np->name; 1055 (*map)->data.mux.function = np->name; 1056 1057 if (PCS_HAS_PINCONF && function) { 1058 res = pcs_parse_pinconf(pcs, np, function, map); 1059 if (res) 1060 goto free_pingroups; 1061 *num_maps = 2; 1062 } else { 1063 *num_maps = 1; 1064 } However, pcs_parse_pinconf() will also return 0 if !PCS_HAS_PINCONF or !nconfs. I believe these conditions should indicate that no map was added by returning -ENOTSUPP. Otherwise pcs_parse_one_pinctrl_entry() will set num_maps = 2 even though no maps were successfully added, as it does not reach "m++" on line 940: 895 static int pcs_parse_pinconf(struct pcs_device *pcs, struct device_node *np, 896 struct pcs_function *func, 897 struct pinctrl_map **map) 898 899 { 900 struct pinctrl_map *m = *map; <snip> 917 /* If pinconf isn't supported, don't parse properties in below. */ 918 if (!PCS_HAS_PINCONF) 919 return 0; 920 921 /* cacluate how much properties are supported in current node */ 922 for (i = 0; i < ARRAY_SIZE(prop2); i++) { 923 if (of_find_property(np, prop2[i].name, NULL)) 924 nconfs++; 925 } 926 for (i = 0; i < ARRAY_SIZE(prop4); i++) { 927 if (of_find_property(np, prop4[i].name, NULL)) 928 nconfs++; 929 } 930 if (!nconfs) 919 return 0; 932 933 func->conf = devm_kcalloc(pcs->dev, 934 nconfs, sizeof(struct pcs_conf_vals), 935 GFP_KERNEL); 936 if (!func->conf) 937 return -ENOMEM; 938 func->nconfs = nconfs; 939 conf = &(func->conf[0]); 940 m++; This situtation will cause a boot failure [0] on the BeagleBone Black (AM3358) when am33xx_pinmux node in arch/arm/boot/dts/am33xx-l4.dtsi has compatible = "pinconf-single" instead of "pinctrl-single". The patch fixes this issue by returning -ENOSUPP when !PCS_HAS_PINCONF or !nconfs, so that pcs_parse_one_pinctrl_entry() will know that no map was added. Logic is also added to pcs_parse_one_pinctrl_entry() to distinguish between -ENOSUPP and other errors. In the case of -ENOSUPP, num_maps is set to 1 as it is valid for pinconf to be enabled and a given pin group to not any pinconf properties. [0] https://lore.kernel.org/linux-omap/20200529175544.GA3766151@x1/ Fixes: 9dddb4df90d1 ("pinctrl: single: support generic pinconf") Signed-off-by: Drew Fustini <[email protected]> Acked-by: Tony Lindgren <[email protected]> Link: https://lore.kernel.org/r/20200608125143.GA2789203@x1 Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]> x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task [ Upstream commit 8ab49526b53d3172d1d8dd03a75c7d1f5bd21239 ] syzbot found its way in 86_fsgsbase_read_task() and triggered this oops: KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 6866 Comm: syz-executor262 Not tainted 5.8.0-syzkaller #0 RIP: 0010:x86_fsgsbase_read_task+0x16d/0x310 arch/x86/kernel/process_64.c:393 Call Trace: putreg32+0x3ab/0x530 arch/x86/kernel/ptrace.c:876 genregs32_set arch/x86/kernel/ptrace.c:1026 [inline] genregs32_set+0xa4/0x100 arch/x86/kernel/ptrace.c:1006 copy_regset_from_user include/linux/regset.h:326 [inline] ia32_arch_ptrace arch/x86/kernel/ptrace.c:1061 [inline] compat_arch_ptrace+0x36c/0xd90 arch/x86/kernel/ptrace.c:1198 __do_compat_sys_ptrace kernel/ptrace.c:1420 [inline] __se_compat_sys_ptrace kernel/ptrace.c:1389 [inline] __ia32_compat_sys_ptrace+0x220/0x2f0 kernel/ptrace.c:1389 do_syscall_32_irqs_on arch/x86/entry/common.c:84 [inline] __do_fast_syscall_32+0x57/0x80 arch/x86/entry/common.c:126 do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:149 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c This can happen if ptrace() or sigreturn() pokes an LDT selector into FS or GS for a task with no LDT and something tries to read the base before a return to usermode notices the bad selector and fixes it. The fix is to make sure ldt pointer is not NULL. Fixes: 07e1d88adaae ("x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately") Co-developed-by: Jann Horn <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Chang S. Bae <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Markus T Metzger <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Shankar <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]> crypto: aesni - add compatibility with IAS [ Upstream commit 44069737ac9625a0f02f0f7f5ab96aae4cd819bc ] Clang's integrated assembler complains "invalid reassignment of non-absolute variable 'var_ddq_add'" while assembling arch/x86/crypto/aes_ctrby8_avx-x86_64.S. It was because var_ddq_add was reassigned with non-absolute values several times, which IAS did not support. We can avoid the reassignment by replacing the uses of var_ddq_add with its definitions accordingly to have compatilibility with IAS. Link: https://github.com/ClangBuiltLinux/linux/issues/1008 Reported-by: Sedat Dilek <[email protected]> Reported-by: Fangrui Song <[email protected]> Tested-by: Sedat Dilek <[email protected]> # build+boot Linux v5.7.5; clang v11.0.0-git Signed-off-by: Jian Cai <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]> af_packet: TPACKET_V3: fix fill status rwlock imbalance [ Upstream commit 88fd1cb80daa20af063bce81e1fad14e945a8dc4 ] After @blk_fill_in_prog_lock is acquired there is an early out vnet situation that can occur. In that case, the rwlock needs to be released. Also, since @blk_fill_in_prog_lock is only acquired when @tp_version is exactly TPACKET_V3, only release it on that exact condition as well. And finally, add sparse annotation so that it is clearer that prb_fill_curr_block() and prb_clear_blk_fill_status() are acquiring and releasing @blk_fill_in_prog_lock, respectively. sparse is still unable to understand the balance, but the warnings are now on a higher level that make more sense. Fixes: 632ca50f2cbd ("af_packet: TPACKET_V3: replace busy-wait loop") Signed-off-by: John Ogness <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drivers/net/wan/lapbether: Added needed_headroom and a skb->len check [ Upstream commit c7ca03c216acb14466a713fedf1b9f2c24994ef2 ] 1. Added a skb->len check This driver expects upper layers to include a pseudo header of 1 byte when passing down a skb for transmission. This driver will read this 1-byte header. This patch added a skb->len check before reading the header to make sure the header exists. 2. Changed to use needed_headroom instead of hard_header_len to request necessary headroom to be allocated In net/packet/af_packet.c, the function packet_snd first reserves a headroom of length (dev->hard_header_len + dev->needed_headroom). Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header, which calls dev->header_ops->create, to create the link layer header. If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of length (dev->hard_header_len), and assumes the user to provide the appropriate link layer header. So according to the logic of af_packet.c, dev->hard_header_len should be the length of the header that would be created by dev->header_ops->create. However, this driver doesn't provide dev->header_ops, so logically dev->hard_header_len should be 0. So we should use dev->needed_headroom instead of dev->hard_header_len to request necessary headroom to be allocated. This change fixes kernel panic when this driver is used with AF_PACKET SOCK_RAW sockets. Call stack when panic: [ 168.399197] skbuff: skb_under_panic: text:ffffffff819d95fb len:20 put:14 head:ffff8882704c0a00 data:ffff8882704c09fd tail:0x11 end:0xc0 dev:veth0 ... [ 168.399255] Call Trace: [ 168.399259] skb_push.cold+0x14/0x24 [ 168.399262] eth_header+0x2b/0xc0 [ 168.399267] lapbeth_data_transmit+0x9a/0xb0 [lapbether] [ 168.399275] lapb_data_transmit+0x22/0x2c [lapb] [ 168.399277] lapb_transmit_buffer+0x71/0xb0 [lapb] [ 168.399279] lapb_kick+0xe3/0x1c0 [lapb] [ 168.399281] lapb_data_request+0x76/0xc0 [lapb] [ 168.399283] lapbeth_xmit+0x56/0x90 [lapbether] [ 168.399286] dev_hard_start_xmit+0x91/0x1f0 [ 168.399289] ? irq_init_percpu_irqstack+0xc0/0x100 [ 168.399291] __dev_queue_xmit+0x721/0x8e0 [ 168.399295] ? packet_parse_headers.isra.0+0xd2/0x110 [ 168.399297] dev_queue_xmit+0x10/0x20 [ 168.399298] packet_sendmsg+0xbf0/0x19b0 ...... Cc: Willem de Bruijn <[email protected]> Cc: Martin Schiller <[email protected]> Cc: Brian Norris <[email protected]> Signed-off-by: Xie He <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net/nfc/rawsock.c: add CAP_NET_RAW check. [ Upstream commit 26896f01467a28651f7a536143fe5ac8449d4041 ] When creating a raw AF_NFC socket, CAP_NET_RAW needs to be checked first. Signed-off-by: Qingyu Li <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: refactor bind_bucket fastreuse into helper [ Upstream commit 62ffc589abb176821662efc4525ee4ac0b9c3894 ] Refactor the fastreuse update code in inet_csk_get_port into a small helper function that can be called from other places. Acked-by: Matthieu Baerts <[email protected]> Signed-off-by: Tim Froidcoeur <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: Set fput_needed iff FDPUT_FPUT is set [ Upstream commit ce787a5a074a86f76f5d3fd804fa78e01bfb9e89 ] We should fput() file iff FDPUT_FPUT is set. So we should set fput_needed accordingly. Fixes: 00e188ef6a7e ("sockfd_lookup_light(): switch to fdget^W^Waway from fget_light") Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: serial: cp210x: re-enable auto-RTS on open commit c7614ff9b73a1e6fb2b1b51396da132ed22fecdb upstream. CP210x hardware disables auto-RTS but leaves auto-CTS when in hardware flow control mode and UART on cp210x hardware is disabled. When re-opening the port, if auto-CTS is enabled on the cp210x, then auto-RTS must be re-enabled in the driver. Signed-off-by: Brant Merryman <[email protected]> Co-developed-by: Phu Luu <[email protected]> Signed-off-by: Phu Luu <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ johan: fix up tags and problem description ] Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control") Cc: stable <[email protected]> # 2.6.12 Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: serial: cp210x: enable usb generic throttle/unthrottle commit 4387b3dbb079d482d3c2b43a703ceed4dd27ed28 upstream. Assign the .throttle and .unthrottle functions to be generic function in the driver structure to prevent d…
[ Upstream commit f678ce8cc3cb2ad29df75d8824c74f36398ba871 ] ddebug_describe_flags() currently fills a caller provided string buffer, after testing its size (also passed) in a BUG_ON. Fix this by replacing them with a known-big-enough string buffer wrapped in a struct, and passing that instead. Also simplify ddebug_describe_flags() flags parameter from a struct to a member in that struct, and hoist the member deref up to the caller. This makes the function reusable (soon) where flags are unpacked. Acked-by: <[email protected]> Signed-off-by: Jim Cromie <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> bcache: fix super block seq numbers comparision in register_cache_set() [ Upstream commit 117f636ea695270fe492d0c0c9dfadc7a662af47 ] In register_cache_set(), c is pointer to struct cache_set, and ca is pointer to struct cache, if ca->sb.seq > c->sb.seq, it means this registering cache has up to date version and other members, the in- memory version and other members should be updated to the newer value. But current implementation makes a cache set only has a single cache device, so the above assumption works well except for a special case. The execption is when a cache device new created and both ca->sb.seq and c->sb.seq are 0, because the super block is never flushed out yet. In the location for the following if() check, 2156 if (ca->sb.seq > c->sb.seq) { 2157 c->sb.version = ca->sb.version; 2158 memcpy(c->sb.set_uuid, ca->sb.set_uuid, 16); 2159 c->sb.flags = ca->sb.flags; 2160 c->sb.seq = ca->sb.seq; 2161 pr_debug("set version = %llu\n", c->sb.version); 2162 } c->sb.version is not initialized yet and valued 0. When ca->sb.seq is 0, the if() check will fail (because both values are 0), and the cache set version, set_uuid, flags and seq won't be updated. The above problem is hiden for current code, because the bucket size is compatible among different super block version. And the next time when running cache set again, ca->sb.seq will be larger than 0 and cache set super block version will be updated properly. But if the large bucket feature is enabled, sb->bucket_size is the low 16bits of the bucket size. For a power of 2 value, when the actual bucket size exceeds 16bit width, sb->bucket_size will always be 0. Then read_super_common() will fail because the if() check to is_power_of_2(sb->bucket_size) is false. This is how the long time hidden bug is triggered. This patch modifies the if() check to the following way, 2156 if (ca->sb.seq > c->sb.seq || c->sb.seq == 0) { Then cache set's version, set_uuid, flags and seq will always be updated corectly including for a new created cache device. Signed-off-by: Coly Li <[email protected]> Reviewed-by: Hannes Reinecke <[email protected]> Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ACPICA: Do not increment operation_region reference counts for field units [ Upstream commit 6a54ebae6d047c988a31f5ac5a64ab5cf83797a2 ] ACPICA commit e17b28cfcc31918d0db9547b6b274b09c413eb70 Object reference counts are used as a part of ACPICA's garbage collection mechanism. This mechanism keeps track of references to heap-allocated structures such as the ACPI operand objects. Recent server firmware has revealed that this reference count can overflow on large servers that declare many field units under the same operation_region. This occurs because each field unit declaration will add a reference count to the source operation_region. This change solves the reference count overflow for operation_regions objects by preventing fieldunits from incrementing their operation_region's reference count. Each operation_region's reference count will not be changed by named objects declared under the Field operator. During namespace deletion, the operation_region namespace node will be deleted and each fieldunit will be deleted without touching the deleted operation_region object. Link: https://github.com/acpica/acpica/commit/e17b28cf Signed-off-by: Erik Kaneda <[email protected]> Signed-off-by: Bob Moore <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]> agp/intel: Fix a memory leak on module initialisation failure [ Upstream commit b975abbd382fe442713a4c233549abb90e57c22b ] In intel_gtt_setup_scratch_page(), pointer "page" is not released if pci_dma_mapping_error() return an error, leading to a memory leak on module initialisation failure. Simply fix this issue by freeing "page" before return. Fixes: 0e87d2b06cb46 ("intel-gtt: initialize our own scratch page") Signed-off-by: Qiushi Wu <[email protected]> Reviewed-by: Chris Wilson <[email protected]> Signed-off-by: Chris Wilson <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> video: fbdev: sm712fb: fix an issue about iounmap for a wrong address [ Upstream commit 98bd4f72988646c35569e1e838c0ab80d06c77f6 ] the sfb->fb->screen_base is not save the value get by iounmap() when the chip id is 0x720. so iounmap() for address sfb->fb->screen_base is not right. Fixes: 1461d6672864854 ("staging: sm7xxfb: merge sm712fb with fbdev") Cc: Andy Shevchenko <[email protected]> Cc: Sudip Mukherjee <[email protected]> Cc: Teddy Wang <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> Signed-off-by: Dejin Zheng <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> console: newport_con: fix an issue about leak related system resources [ Upstream commit fd4b8243877250c05bb24af7fea5567110c9720b ] A call of the function do_take_over_console() can fail here. The corresponding system resources were not released then. Thus add a call of iounmap() and release_mem_region() together with the check of a failure predicate. and also add release_mem_region() on device removal. Fixes: e86bb8acc0fdc ("[PATCH] VT binding: Make newport_con support binding") Suggested-by: Bartlomiej Zolnierkiewicz <[email protected]> Signed-off-by: Dejin Zheng <[email protected]> Reviewed-by: Andy Shevchenko <[email protected]> Cc: Greg Kroah-Hartman <[email protected]> cc: Thomas Gleixner <[email protected]> Cc: Andrew Morton <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> video: pxafb: Fix the function used to balance a 'dma_alloc_coherent()' call [ Upstream commit 499a2c41b954518c372873202d5e7714e22010c4 ] 'dma_alloc_coherent()' must be balanced by a call to 'dma_free_coherent()' not 'dma_free_wc()'. The correct dma_free_ function is already used in the error handling path of the probe function. Fixes: 77e196752bdd ("[ARM] pxafb: allow video memory size to be configurable") Signed-off-by: Christophe JAILLET <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Rafael J. Wysocki <[email protected]> Cc: Jonathan Corbet <[email protected]> Cc: Viresh Kumar <[email protected]> Cc: Jani Nikula <[email protected]> cc: Mauro Carvalho Chehab <[email protected]> Cc: Eric Miao <[email protected]> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> iio: improve IIO_CONCENTRATION channel type description [ Upstream commit df16c33a4028159d1ba8a7061c9fa950b58d1a61 ] IIO_CONCENTRATION together with INFO_RAW specifier is used for reporting raw concentrations of pollutants. Raw value should be meaningless before being properly scaled. Because of that description shouldn't mention raw value unit whatsoever. Fix this by rephrasing existing description so it follows conventions used throughout IIO ABI docs. Fixes: 8ff6b3bc94930 ("iio: chemical: Add IIO_CONCENTRATION channel type") Signed-off-by: Tomasz Duszynski <[email protected]> Acked-by: Matt Ranostay <[email protected]> Signed-off-by: Jonathan Cameron <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/arm: fix unintentional integer overflow on left shift [ Upstream commit 5f368ddea6fec519bdb93b5368f6a844b6ea27a6 ] Shifting the integer value 1 is evaluated using 32-bit arithmetic and then used in an expression that expects a long value leads to a potential integer overflow. Fix this by using the BIT macro to perform the shift to avoid the overflow. Addresses-Coverity: ("Unintentional integer overflow") Fixes: ad49f8602fe8 ("drm/arm: Add support for Mali Display Processors") Signed-off-by: Colin Ian King <[email protected]> Acked-by: Liviu Dudau <[email protected]> Signed-off-by: Liviu Dudau <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> leds: lm355x: avoid enum conversion warning [ Upstream commit 985b1f596f9ed56f42b8c2280005f943e1434c06 ] clang points out that doing arithmetic between diffent enums is usually a mistake: drivers/leds/leds-lm355x.c:167:28: warning: bitwise operation between different enumeration types ('enum lm355x_tx2' and 'enum lm355x_ntc') [-Wenum-enum-conversion] reg_val = pdata->pin_tx2 | pdata->ntc_pin; ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ drivers/leds/leds-lm355x.c:178:28: warning: bitwise operation between different enumeration types ('enum lm355x_tx2' and 'enum lm355x_ntc') [-Wenum-enum-conversion] reg_val = pdata->pin_tx2 | pdata->ntc_pin | pdata->pass_mode; ~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~ In this driver, it is intentional, so add a cast to hide the false-positive warning. It appears to be the only instance of this warning at the moment. Fixes: b98d13c72592 ("leds: Add new LED driver for lm355x chips") Signed-off-by: Arnd Bergmann <[email protected]> Signed-off-by: Pavel Machek <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: omap3isp: Add missed v4l2_ctrl_handler_free() for preview_init_entities() [ Upstream commit dc7690a73017e1236202022e26a6aa133f239c8c ] preview_init_entities() does not call v4l2_ctrl_handler_free() when it fails. Add the missed function to fix it. Fixes: de1135d44f4f ("[media] omap3isp: CCDC, preview engine and resizer") Signed-off-by: Chuhong Yuan <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Signed-off-by: Sakari Ailus <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ASoC: Intel: bxt_rt298: add missing .owner field [ Upstream commit 88cee34b776f80d2da04afb990c2a28c36799c43 ] This field is required for ASoC cards. Not setting it will result in a module->name pointer being NULL and generate problems such as cat /proc/asound/modules 0 (efault) Fixes: 76016322ec56 ('ASoC: Intel: Add Broxton-P machine driver') Reported-by: Jaroslav Kysela <[email protected]> Suggested-by: Takashi Iwai <[email protected]> Signed-off-by: Pierre-Louis Bossart <[email protected]> Reviewed-by: Kai Vehmanen <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Mark Brown <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: cumana_2: Fix different dev_id between request_irq() and free_irq() [ Upstream commit 040ab9c4fd0070cd5fa71ba3a7b95b8470db9b4d ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Acked-by: Russell King <[email protected]> Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/mipi: use dcs write for mipi_dsi_dcs_set_tear_scanline [ Upstream commit 7a05c3b6d24b8460b3cec436cf1d33fac43c8450 ] The helper uses the MIPI_DCS_SET_TEAR_SCANLINE, although it's currently using the generic write. This does not look right. Perhaps some platforms don't distinguish between the two writers? Cc: Robert Chiras <[email protected]> Cc: Vinay Simha BN <[email protected]> Cc: Jani Nikula <[email protected]> Cc: Thierry Reding <[email protected]> Fixes: e83950816367 ("drm/dsi: Implement set tear scanline") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Thierry Reding <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> cxl: Fix kobject memleak [ Upstream commit 85c5cbeba8f4fb28e6b9bfb3e467718385f78f76 ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Fix it by adding a call to kobject_put() in the error path of kobject_init_and_add(). Fixes: b087e6190ddc ("cxl: Export optional AFU configuration record in sysfs") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Acked-by: Andrew Donnellan <[email protected]> Acked-by: Frederic Barrat <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/radeon: fix array out-of-bounds read and write issues [ Upstream commit 7ee78aff9de13d5dccba133f4a0de5367194b243 ] There is an off-by-one bounds check on the index into arrays table->mc_reg_address and table->mc_reg_table_entry[k].mc_data[j] that can lead to reads and writes outside of arrays. Fix the bound checking off-by-one error. Addresses-Coverity: ("Out-of-bounds read/write") Fixes: cc8dbbb4f62a ("drm/radeon: add dpm support for CI dGPUs (v2)") Signed-off-by: Colin Ian King <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: powertec: Fix different dev_id between request_irq() and free_irq() [ Upstream commit d179f7c763241c1dc5077fca88ddc3c47d21b763 ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: eesox: Fix different dev_id between request_irq() and free_irq() [ Upstream commit 86f2da1112ccf744ad9068b1d5d9843faf8ddee6 ] The dev_id used in request_irq() and free_irq() should match. Use 'info' in both cases. Link: https://lore.kernel.org/r/[email protected] Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> ipvs: allow connection reuse for unconfirmed conntrack [ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 https://github.com/kubernetes/kubernetes/issues/70747 - Apache Bench can fill up ipvs service proxy in seconds #544 https://github.com/cloudnativelabs/kube-router/issues/544 - Additional 1s latency in `host -> service IP -> pod` https://github.com/kubernetes/kubernetes/issues/90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: firewire: Using uninitialized values in node_probe() [ Upstream commit 2505a210fc126599013aec2be741df20aaacc490 ] If fw_csr_string() returns -ENOENT, then "name" is uninitialized. So then the "strlen(model_names[i]) <= name_len" is true because strlen() is unsigned and -ENOENT is type promoted to a very high positive value. Then the "strncmp(name, model_names[i], name_len)" uses uninitialized data because "name" is uninitialized. Fixes: 92374e886c75 ("[media] firedtv: drop obsolete backend abstraction") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> media: exynos4-is: Add missed check for pinctrl_lookup_state() [ Upstream commit 18ffec750578f7447c288647d7282c7d12b1d969 ] fimc_md_get_pinctrl() misses a check for pinctrl_lookup_state(). Add the missed check to fix it. Fixes: 4163851f7b99 ("[media] s5p-fimc: Use pinctrl API for camera ports configuration]") Signed-off-by: Chuhong Yuan <[email protected]> Signed-off-by: Hans Verkuil <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Sasha Levin <[email protected]> xfs: fix reflink quota reservation accounting error [ Upstream commit 83895227aba1ade33e81f586aa7b6b1e143096a5 ] Quota reservations are supposed to account for the blocks that might be allocated due to a bmap btree split. Reflink doesn't do this, so fix this to make the quota accounting more accurate before we start rearranging things. Fixes: 862bb360ef56 ("xfs: reflink extents from one file to another") Signed-off-by: Darrick J. Wong <[email protected]> Reviewed-by: Brian Foster <[email protected]> Signed-off-by: Sasha Levin <[email protected]> PCI: Fix pci_cfg_wait queue locking problem [ Upstream commit 2a7e32d0547f41c5ce244f84cf5d6ca7fccee7eb ] The pci_cfg_wait queue is used to prevent user-space config accesses to devices while they are recovering from reset. Previously we used these operations on pci_cfg_wait: __add_wait_queue(&pci_cfg_wait, ...) __remove_wait_queue(&pci_cfg_wait, ...) wake_up_all(&pci_cfg_wait) The wake_up acquires the wait queue lock, but the add and remove do not. Originally these were all protected by the pci_lock, but cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock"), moved wake_up_all() outside pci_lock, so it could race with add/remove operations, which caused occasional kernel panics, e.g., during vfio-pci hotplug/unplug testing: Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000 Resolve this by using wait_event() instead of __add_wait_queue() and __remove_wait_queue(). The wait queue lock is held by both wait_event() and wake_up_all(), so it provides mutual exclusion. Fixes: cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock") Link: https://lore.kernel.org/linux-pci/[email protected]/T/#u Based-on: https://lore.kernel.org/linux-pci/[email protected]/ Based-on-patch-by: Xiang Zheng <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Tested-by: Xiang Zheng <[email protected]> Cc: Heyi Guo <[email protected]> Cc: Biaoxiang Ye <[email protected]> Signed-off-by: Sasha Levin <[email protected]> leds: core: Flush scheduled work for system suspend [ Upstream commit 302a085c20194bfa7df52e0fe684ee0c41da02e6 ] Sometimes LED won't be turned off by LED_CORE_SUSPENDRESUME flag upon system suspend. led_set_brightness_nopm() uses schedule_work() to set LED brightness. However, there's no guarantee that the scheduled work gets executed because no one flushes the work. So flush the scheduled work to make sure LED gets turned off. Signed-off-by: Kai-Heng Feng <[email protected]> Acked-by: Jacek Anaszewski <[email protected]> Fixes: 81fe8e5b73e3 ("leds: core: Add led_set_brightness_nosleep{nopm} functions") Signed-off-by: Pavel Machek <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm: panel: simple: Fix bpc for LG LB070WV8 panel [ Upstream commit a6ae2fe5c9f9fd355a48fb7d21c863e5b20d6c9c ] The LG LB070WV8 panel incorrectly reports a 16 bits per component value, while the panel uses 8 bits per component. Fix it. Fixes: dd0150026901 ("drm/panel: simple: Add support for LG LB070WV8 800x480 7" panel") Signed-off-by: Laurent Pinchart <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> drm/bridge: sil_sii8620: initialize return of sii8620_readb [ Upstream commit 02cd2d3144653e6e2a0c7ccaa73311e48e2dc686 ] clang static analysis flags this error sil-sii8620.c:184:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return ret; ^~~~~~~~~~ sii8620_readb calls sii8620_read_buf. sii8620_read_buf can return without setting its output pararmeter 'ret'. So initialize ret. Fixes: ce6e153f414a ("drm/bridge: add Silicon Image SiI8620 driver") Signed-off-by: Tom Rix <[email protected]> Reviewed-by: Laurent Pinchart <[email protected]> Reviewed-by: Andrzej Hajda <[email protected]> Signed-off-by: Sam Ravnborg <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]> scsi: scsi_debug: Add check for sdebug_max_queue during module init [ Upstream commit c87bf24cfb60bce27b4d2c7e56ebfd86fb9d16bb ] sdebug_max_queue should not exceed SDEBUG_CANQUEUE, otherwise crashes like this can be triggered by passing an out-of-range value: Hardware name: Huawei D06 /D06, BIOS Hisilicon D06 UEFI RC0 - V1.16.01 03/15/2019 pstate: 20400009 (nzCv daif +PAN -UAO BTYPE=--) pc : schedule_resp+0x2a4/0xa70 [scsi_debug] lr : schedule_resp+0x52c/0xa70 [scsi_debug] sp : ffff800022ab36f0 x29: ffff800022ab36f0 x28: ffff0023a935a610 x27: ffff800008e0a648 x26: 0000000000000003 x25: ffff0023e84f3200 x24: 00000000003d0900 x23: 0000000000000000 x22: 0000000000000000 x21: ffff0023be60a320 x20: ffff0023be60b538 x19: ffff800008e13000 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: 0000000000000000 x9 : 0000000000000001 x8 : 0000000000000000 x7 : 0000000000000000 x6 : 00000000000000c1 x5 : 0000020000200000 x4 : dead0000000000ff x3 : 0000000000000200 x2 : 0000000000000200 x1 : ffff800008e13d88 x0 : 0000000000000000 Call trace: schedule_resp+0x2a4/0xa70 [scsi_debug] scsi_debug_queuecommand+0x2c4/0x9e0 [scsi_debug] scsi_queue_rq+0x698/0x840 __blk_mq_try_issue_directly+0x108/0x228 blk_mq_request_issue_directly+0x58/0x98 blk_mq_try_issue_list_directly+0x5c/0xf0 blk_mq_sched_insert_requests+0x18c/0x200 blk_mq_flush_plug_list+0x11c/0x190 blk_flush_plug_list+0xdc/0x110 blk_finish_plug+0x38/0x210 blkdev_direct_IO+0x450/0x4d8 generic_file_read_iter+0x84/0x180 blkdev_read_iter+0x3c/0x50 aio_read+0xc0/0x170 io_submit_one+0x5c8/0xc98 __arm64_sys_io_submit+0x1b0/0x258 el0_svc_common.constprop.3+0x68/0x170 do_el0_svc+0x24/0x90 el0_sync_handler+0x13c/0x1a8 el0_sync+0x158/0x180 Code: 528847e0 72a001e0 6b00003f 540018cd (3941c340) In addition, it should not be less than 1. So add checks for these, and fail the module init for those cases. [mkp: changed if condition to match error message] Link: https://lore.kernel.org/r/[email protected] Fixes: c483739430f1 ("scsi_debug: add multiple queue support") Reviewed-by: Ming Lei <[email protected]> Acked-by: Douglas Gilbert <[email protected]> Signed-off-by: John Garry <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> mwifiex: Prevent memory corruption handling keys [ Upstream commit e18696786548244914f36ec3c46ac99c53df99c3 ] The length of the key comes from the network and it's a 16 bit number. It needs to be capped to prevent a buffer overflow. Fixes: 5e6e3a92b9a4 ("wireless: mwifiex: initial commit for Marvell mwifiex driver") Signed-off-by: Dan Carpenter <[email protected]> Acked-by: Ganapathi Bhat <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/20200708115857.GA13729@mwanda Signed-off-by: Sasha Levin <[email protected]> powerpc/vdso: Fix vdso cpu truncation [ Upstream commit a9f675f950a07d5c1dbcbb97aabac56f5ed085e3 ] The code in vdso_cpu_init that exposes the cpu and numa node to userspace via SPRG_VDSO incorrctly masks the cpu to 12 bits. This means that any kernel running on a box with more than 4096 threads (NR_CPUS advertises a limit of of 8192 cpus) would expose userspace to two cpu contexts running at the same time with the same cpu number. Note: I'm not aware of any distro shipping a kernel with support for more than 4096 threads today, nor of any system image that currently exceeds 4096 threads. Found via code browsing. Fixes: 18ad51dd342a7eb09dbcd059d0b451b616d4dafc ("powerpc: Add VDSO version of getcpu") Signed-off-by: Milton Miller <[email protected]> Signed-off-by: Anton Blanchard <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> staging: rtl8192u: fix a dubious looking mask before a shift [ Upstream commit c4283950a9a4d3bf4a3f362e406c80ab14f10714 ] Currently the masking of ret with 0xff and followed by a right shift of 8 bits always leaves a zero result. It appears the mask of 0xff is incorrect and should be 0xff00, but I don't have the hardware to test this. Fix this to mask the upper 8 bits before shifting. [ Not tested ] Addresses-Coverity: ("Operands don't affect result") Fixes: 8fc8598e61f6 ("Staging: Added Realtek rtl8192u driver to staging") Signed-off-by: Colin Ian King <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> PCI/ASPM: Add missing newline in sysfs 'policy' [ Upstream commit 3167e3d340c092fd47924bc4d23117a3074ef9a9 ] When I cat ASPM parameter 'policy' by sysfs, it displays as follows. Add a newline for easy reading. Other sysfs attributes already include a newline. [root@localhost ~]# cat /sys/module/pcie_aspm/parameters/policy [default] performance powersave powersupersave [root@localhost ~]# Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Xiongfeng Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]> drm/imx: tve: fix regulator_disable error path [ Upstream commit 7bb58b987fee26da2a1665c01033022624986b7c ] Add missing regulator_disable() as devm_action to avoid dedicated unbind() callback and fix the missing error handling. Fixes: fcbc51e54d2a ("staging: drm/imx: Add support for Television Encoder (TVEv2)") Signed-off-by: Marco Felsch <[email protected]> Signed-off-by: Philipp Zabel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> USB: serial: iuu_phoenix: fix led-activity helpers [ Upstream commit de37458f8c2bfc465500a1dd0d15dbe96d2a698c ] The set-led command is eight bytes long and starts with a command byte followed by six bytes of RGB data and ends with a byte encoding a frequency (see iuu_led() and iuu_rgbf_fill_buffer()). The led activity helpers had a few long-standing bugs which corrupted the command packets by inserting a second command byte and thereby offsetting the RGB data and dropping the frequency in non-xmas mode. In xmas mode, a related off-by-one error left the frequency field uninitialised. Fixes: 60a8fc017103 ("USB: add iuu_phoenix driver") Reported-by: George Spelvin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Sasha Levin <[email protected]> thermal: ti-soc-thermal: Fix reversed condition in ti_thermal_expose_sensor() [ Upstream commit 0f348db01fdf128813fdd659fcc339038fb421a4 ] This condition is reversed and will cause breakage. Fixes: 7440f518dad9 ("thermal/drivers/ti-soc-thermal: Avoid dereferencing ERR_PTR") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Daniel Lezcano <[email protected]> Link: https://lore.kernel.org/r/20200616091949.GA11940@mwanda Signed-off-by: Sasha Levin <[email protected]> coresight: tmc: Fix TMC mode read in tmc_read_unprepare_etb() [ Upstream commit d021f5c5ff679432c5e9faee0fd7350db2efb97c ] Reading TMC mode register without proper coresight power management can lead to exceptions like the one in the call trace below in tmc_read_unprepare_etb() when the trace data is read after the sink is disabled. So fix this by having a check for coresight sysfs mode before reading TMC mode management register in tmc_read_unprepare_etb() similar to tmc_read_prepare_etb(). SError Interrupt on CPU6, code 0xbe000411 -- SError pstate: 80400089 (Nzcv daIf +PAN -UAO) pc : tmc_read_unprepare_etb+0x74/0x108 lr : tmc_read_unprepare_etb+0x54/0x108 sp : ffffff80d9507c30 x29: ffffff80d9507c30 x28: ffffff80b3569a0c x27: 0000000000000000 x26: 00000000000a0001 x25: ffffff80cbae9550 x24: 0000000000000010 x23: ffffffd07296b0f0 x22: ffffffd0109ee028 x21: 0000000000000000 x20: ffffff80d19e70e0 x19: ffffff80d19e7080 x18: 0000000000000000 x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: dfffffd000000001 x9 : 0000000000000000 x8 : 0000000000000002 x7 : ffffffd071d0fe78 x6 : 0000000000000000 x5 : 0000000000000080 x4 : 0000000000000001 x3 : ffffffd071d0fe98 x2 : 0000000000000000 x1 : 0000000000000004 x0 : 0000000000000001 Kernel panic - not syncing: Asynchronous SError Interrupt Fixes: 4525412a5046 ("coresight: tmc: making prepare/unprepare functions generic") Reported-by: Mike Leach <[email protected]> Signed-off-by: Sai Prakash Ranjan <[email protected]> Tested-by: Mike Leach <[email protected]> Signed-off-by: Mathieu Poirier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Sasha Levin <[email protected]> MIPS: OCTEON: add missing put_device() call in dwc3_octeon_device_init() [ Upstream commit e8b9fc10f2615b9a525fce56981e40b489528355 ] if of_find_device_by_node() succeed, dwc3_octeon_device_init() doesn't have a corresponding put_device(). Thus add put_device() to fix the exception handling for this function implementation. Fixes: 93e502b3c2d4 ("MIPS: OCTEON: Platform support for OCTEON III USB controller") Signed-off-by: Yu Kuai <[email protected]> Signed-off-by: Thomas Bogendoerfer <[email protected]> Signed-off-by: Sasha Levin <[email protected]> usb: dwc2: Fix error path in gadget registration [ Upstream commit 33a06f1300a79cfd461cea0268f05e969d4f34ec ] When gadget registration fails, one should not call usb_del_gadget_udc(). Ensure this by setting gadget->udc to NULL. Also in case of a failure there is no need to disable low-level hardware, so return immiedetly instead of jumping to error_init label. This fixes the following kernel NULL ptr dereference on gadget failure (can be easily triggered with g_mass_storage without any module parameters): dwc2 12480000.hsotg: dwc2_check_params: Invalid parameter besl=1 dwc2 12480000.hsotg: dwc2_check_params: Invalid parameter g_np_tx_fifo_size=1024 dwc2 12480000.hsotg: EPs: 16, dedicated fifos, 7808 entries in SPRAM Mass Storage Function, version: 2009/09/11 LUN: removable file: (no medium) no file given for LUN0 g_mass_storage 12480000.hsotg: failed to start g_mass_storage: -22 8<--- cut here --- Unable to handle kernel NULL pointer dereference at virtual address 00000104 pgd = (ptrval) [00000104] *pgd=00000000 Internal error: Oops: 805 [#1] PREEMPT SMP ARM Modules linked in: CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.8.0-rc5 #3133 Hardware name: Samsung Exynos (Flattened Device Tree) Workqueue: events deferred_probe_work_func PC is at usb_del_gadget_udc+0x38/0xc4 LR is at __mutex_lock+0x31c/0xb18 ... Process kworker/0:1 (pid: 12, stack limit = 0x(ptrval)) Stack: (0xef121db0 to 0xef122000) ... [<c076bf3c>] (usb_del_gadget_udc) from [<c0726bec>] (dwc2_hsotg_remove+0x10/0x20) [<c0726bec>] (dwc2_hsotg_remove) from [<c0711208>] (dwc2_driver_probe+0x57c/0x69c) [<c0711208>] (dwc2_driver_probe) from [<c06247c0>] (platform_drv_probe+0x6c/0xa4) [<c06247c0>] (platform_drv_probe) from [<c0621df4>] (really_probe+0x200/0x48c) [<c0621df4>] (really_probe) from [<c06221e8>] (driver_probe_device+0x78/0x1fc) [<c06221e8>] (driver_probe_device) from [<c061fcd4>] (bus_for_each_drv+0x74/0xb8) [<c061fcd4>] (bus_for_each_drv) from [<c0621b54>] (__device_attach+0xd4/0x16c) [<c0621b54>] (__device_attach) from [<c0620c98>] (bus_probe_device+0x88/0x90) [<c0620c98>] (bus_probe_device) from [<c06211b0>] (deferred_probe_work_func+0x3c/0xd0) [<c06211b0>] (deferred_probe_work_func) from [<c0149280>] (process_one_work+0x234/0x7dc) [<c0149280>] (process_one_work) from [<c014986c>] (worker_thread+0x44/0x51c) [<c014986c>] (worker_thread) from [<c0150b1c>] (kthread+0x158/0x1a0) [<c0150b1c>] (kthread) from [<c0100114>] (ret_from_fork+0x14/0x20) Exception stack(0xef121fb0 to 0xef121ff8) ... ---[ end trace 9724c2fc7cc9c982 ]--- While fixing this also fix the double call to dwc2_lowlevel_hw_disable() if dr_mode is set to USB_DR_MODE_PERIPHERAL. In such case low-level hardware is already disabled before calling usb_add_gadget_udc(). That function correctly preserves low-level hardware state, there is no need for the second unconditional dwc2_lowlevel_hw_disable() call. Fixes: 207324a321a8 ("usb: dwc2: Postponed gadget registration to the udc class driver") Acked-by: Minas Harutyunyan <[email protected]> Signed-off-by: Marek Szyprowski <[email protected]> Signed-off-by: Felipe Balbi <[email protected]> Signed-off-by: Sasha Levin <[email protected]> scsi: mesh: Fix panic after host or bus reset [ Upstream commit edd7dd2292ab9c3628b65c4d04514c3068ad54f6 ] Booting Linux with a Conner CP3200 drive attached to the MESH SCSI bus results in EH measures and a panic: [ 25.499838] mesh: configured for synchronous 5 MB/s [ 25.787154] mesh: performing initial bus reset... [ 29.867115] scsi host0: MESH [ 29.929527] mesh: target 0 synchronous at 3.6 MB/s [ 29.998763] scsi 0:0:0:0: Direct-Access CONNER CP3200-200mb-3.5 4040 PQ: 0 ANSI: 1 CCS [ 31.989975] sd 0:0:0:0: [sda] 415872 512-byte logical blocks: (213 MB/203 MiB) [ 32.070975] sd 0:0:0:0: [sda] Write Protect is off [ 32.137197] sd 0:0:0:0: [sda] Mode Sense: 5b 00 00 08 [ 32.209661] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA [ 32.332708] sda: [mac] sda1 sda2 sda3 [ 32.417733] sd 0:0:0:0: [sda] Attached SCSI disk ... snip ... [ 76.687067] mesh_abort((ptrval)) [ 76.743606] mesh: state at (ptrval), regs at (ptrval), dma at (ptrval) [ 76.810798] ct=6000 seq=86 bs=4017 fc= 0 exc= 0 err= 0 im= 7 int= 0 sp=85 [ 76.880720] dma stat=84e0 cmdptr=1f73d000 [ 76.941387] phase=4 msgphase=0 conn_tgt=0 data_ptr=24576 [ 77.005567] dma_st=1 dma_ct=0 n_msgout=0 [ 77.065456] target 0: req=(ptrval) goes_out=0 saved_ptr=0 [ 77.130512] mesh_abort((ptrval)) [ 77.187670] mesh: state at (ptrval), regs at (ptrval), dma at (ptrval) [ 77.255594] ct=6000 seq=86 bs=4017 fc= 0 exc= 0 err= 0 im= 7 int= 0 sp=85 [ 77.325778] dma stat=84e0 cmdptr=1f73d000 [ 77.387239] phase=4 msgphase=0 conn_tgt=0 data_ptr=24576 [ 77.453665] dma_st=1 dma_ct=0 n_msgout=0 [ 77.515900] target 0: req=(ptrval) goes_out=0 saved_ptr=0 [ 77.582902] mesh_host_reset [ 88.187083] Kernel panic - not syncing: mesh: double DMA start ! [ 88.254510] CPU: 0 PID: 358 Comm: scsi_eh_0 Not tainted 5.6.13-pmac #1 [ 88.323302] Call Trace: [ 88.378854] [e16ddc58] [c0027080] panic+0x13c/0x308 (unreliable) [ 88.446221] [e16ddcb8] [c02b2478] mesh_start.part.12+0x130/0x414 [ 88.513298] [e16ddcf8] [c02b2fc8] mesh_queue+0x54/0x70 [ 88.577097] [e16ddd18] [c02a1848] scsi_send_eh_cmnd+0x374/0x384 [ 88.643476] [e16dddc8] [c02a1938] scsi_eh_tur+0x5c/0xb8 [ 88.707878] [e16dddf8] [c02a1ab8] scsi_eh_test_devices+0x124/0x178 [ 88.775663] [e16dde28] [c02a2094] scsi_eh_ready_devs+0x588/0x8a8 [ 88.843124] [e16dde98] [c02a31d8] scsi_error_handler+0x344/0x520 [ 88.910697] [e16ddf08] [c00409c8] kthread+0xe4/0xe8 [ 88.975166] [e16ddf38] [c000f234] ret_from_kernel_thread+0x14/0x1c [ 89.044112] Rebooting in 180 seconds.. In theory, a panic can happen after a bus or host reset with dma_started flag set. Fix this by halting the DMA before reinitializing the host. Don't assume that ms->current_req is set when halt_dma() is invoked as it may not hold for bus or host reset. BTW, this particular Conner drive can be made to work by inhibiting disconnect/reselect with 'mesh.resel_targets=0'. Link: https://lore.kernel.org/r/3952bc691e150a7128b29120999b6092071b039a.1595460351.git.fthain@telegraphics.com.au Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: Paul Mackerras <[email protected]> Reported-and-tested-by: Stan Johnson <[email protected]> Signed-off-by: Finn Thain <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]> net: dsa: mv88e6xxx: MV88E6097 does not support jumbo configuration [ Upstream commit 0f3c66a3c7b4e8b9f654b3c998e9674376a51b0f ] The MV88E6097 chip does not support configuring jumbo frames. Prior to commit 5f4366660d65 only the 6352, 6351, 6165 and 6320 chips configured jumbo mode. The refactor accidentally added the function for the 6097. Remove the erroneous function pointer assignment. Fixes: 5f4366660d65 ("net: dsa: mv88e6xxx: Refactor setting of jumbo frames") Signed-off-by: Chris Packham <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Smack: fix another vsscanf out of bounds [ Upstream commit a6bd4f6d9b07452b0b19842044a6c3ea384b0b88 ] This is similar to commit 84e99e58e8d1 ("Smack: slab-out-of-bounds in vsscanf") where we added a bounds check on "rule". Reported-by: [email protected] Fixes: f7112e6c9abf ("Smack: allow for significantly longer Smack labels v4") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Smack: prevent underflow in smk_set_cipso() [ Upstream commit 42a2df3e829f3c5562090391b33714b2e2e5ad4a ] We have an upper bound on "maplevel" but forgot to check for negative values. Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel") Signed-off-by: Dan Carpenter <[email protected]> Signed-off-by: Casey Schaufler <[email protected]> Signed-off-by: Sasha Levin <[email protected]> power: supply: check if calc_soc succeeded in pm860x_init_battery [ Upstream commit ccf193dee1f0fff55b556928591f7818bac1b3b1 ] clang static analysis flags this error 88pm860x_battery.c:522:19: warning: Assigned value is garbage or undefined [core.uninitialized.Assign] info->start_soc = soc; ^ ~~~ soc is set by calling calc_soc. But calc_soc can return without setting soc. So check the return status and bail similarly to other checks in pm860x_init_battery and initialize soc to silence the warning. Fixes: a830d28b48bf ("power_supply: Enable battery-charger for 88pm860x") Signed-off-by: Tom Rix <[email protected]> Signed-off-by: Sebastian Reichel <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Bluetooth: hci_serdev: Only unregister device if it was registered [ Upstream commit 202798db9570104728dce8bb57dfeed47ce764bc ] We should not call hci_unregister_dev if the device was not successfully registered. Fixes: c34dc3bfa7642fd ("Bluetooth: hci_serdev: Introduce hci_uart_unregister_device()") Signed-off-by: Nicolas Boichat <[email protected]> Signed-off-by: Marcel Holtmann <[email protected]> Signed-off-by: Sasha Levin <[email protected]> selftests/powerpc: Fix CPU affinity for child process [ Upstream commit 854eb5022be04f81e318765f089f41a57c8e5d83 ] On systems with large number of cpus, test fails trying to set affinity by calling sched_setaffinity() with smaller size for affinity mask. This patch fixes it by making sure that the size of allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs(). Fixes: 00b7ec5c9cf3 ("selftests/powerpc: Import Anton's context_switch2 benchmark") Reported-by: Shirisha Ganta <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Signed-off-by: Harish <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Reviewed-by: Satheesh Rajendran <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> PCI: Release IVRS table in AMD ACS quirk [ Upstream commit 090688fa4e448284aaa16136372397d7d10814db ] The acpi_get_table() should be coupled with acpi_put_table() if the mapped table is not used at runtime to release the table mapping. In pci_quirk_amd_sb_acs(), IVRS table is just used for checking AMD IOMMU is supported, not used at runtime, so put the table after using it. Fixes: 15b100dfd1c9 ("PCI: Claim ACS support for AMD southbridge devices") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Hanjun Guo <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Signed-off-by: Sasha Levin <[email protected]> selftests/powerpc: Fix online CPU selection [ Upstream commit dfa03fff86027e58c8dba5c03ae68150d4e513ad ] The size of the CPU affinity mask must be large enough for systems with a very large number of CPUs. Otherwise, tests which try to determine the first online CPU by calling sched_getaffinity() will fail. This makes sure that the size of the allocated affinity mask is dependent on the number of CPUs as reported by get_nprocs_conf(). Fixes: 3752e453f6ba ("selftests/powerpc: Add tests of PMU EBBs") Reported-by: Shirisha Ganta <[email protected]> Signed-off-by: Sandipan Das <[email protected]> Reviewed-by: Kamalesh Babulal <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/a408c4b8e9a23bb39b539417a21eb0ff47bb5127.1596084858.git.sandipan@linux.ibm.com Signed-off-by: Sasha Levin <[email protected]> s390/qeth: don't process empty bridge port events [ Upstream commit 02472e28b9a45471c6d8729ff2c7422baa9be46a ] Discard events that don't contain any entries. This shouldn't happen, but subsequent code relies on being able to use entry 0. So better be safe than accessing garbage. Fixes: b4d72c08b358 ("qeth: bridgeport support - basic control") Signed-off-by: Julian Wiedmann <[email protected]> Reviewed-by: Alexandra Winter <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> wl1251: fix always return 0 error [ Upstream commit 20e6421344b5bc2f97b8e2db47b6994368417904 ] wl1251_event_ps_report() should not always return 0 because wl1251_ps_set_mode() may fail. Change it to return 'ret'. Fixes: f7ad1eed4d4b ("wl1251: retry power save entry") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: Kalle Valo <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]> tools, build: Propagate build failures from tools/build/Makefile.build [ Upstream commit a278f3d8191228212c553a5d4303fa603214b717 ] The '&&' command seems to have a bad effect when $(cmd_$(1)) exits with non-zero effect: the command failure is masked (despite `set -e`) and all but the first command of $(dep-cmd) is executed (successfully, as they are mostly printfs), thus overall returning 0 in the end. This means in practice that despite compilation errors, tools's build Makefile will return success. We see this very reliably with libbpf's Makefile, which doesn't get compilation error propagated properly. This in turns causes issues with selftests build, as well as bpftool and other projects that rely on building libbpf. The fix is simple: don't use &&. Given `set -e`, we don't need to chain commands with &&. The shell will exit on first failure, giving desired behavior and propagating error properly. Fixes: 275e2d95591e ("tools build: Move dependency copy into function") Signed-off-by: Andrii Nakryiko <[email protected]> Signed-off-by: Daniel Borkmann <[email protected]> Acked-by: Jiri Olsa <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]> net: ethernet: aquantia: Fix wrong return value [ Upstream commit 0470a48880f8bc42ce26962b79c7b802c5a695ec ] In function hw_atl_a0_hw_multicast_list_set(), when an invalid request is encountered, a negative error code should be returned. Fixes: bab6de8fd180b ("net: ethernet: aquantia: Atlantic A0 and B0 specific functions") Cc: David VomLehn <[email protected]> Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> liquidio: Fix wrong return value in cn23xx_get_pf_num() [ Upstream commit aa027850a292ea65524b8fab83eb91a124ad362c ] On an error exit path, a negative error code should be returned instead of a positive return value. Fixes: 0c45d7fe12c7e ("liquidio: fix use of pf in pass-through mode in a virtual machine") Cc: Rick Farrington <[email protected]> Signed-off-by: Tianjia Zhang <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> net: spider_net: Fix the size used in a 'dma_free_coherent()' call [ Upstream commit 36f28f7687a9ce665479cce5d64ce7afaa9e77ae ] Update the size used in 'dma_free_coherent()' in order to match the one used in the corresponding 'dma_alloc_coherent()', in 'spider_net_init_chain()'. Fixes: d4ed8f8d1fb7 ("Spidernet DMA coalescing") Signed-off-by: Christophe JAILLET <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: use 32-bit unsigned integer [ Upstream commit 99f47abd9f7bf6e365820d355dc98f6955a562df ] Potentially overflowing expression (ts_freq << 16 and intgr << 16) declared as type u32 (32-bit unsigned) is evaluated using 32-bit arithmetic and then used in a context that expects an expression of type u64 (64-bit unsigned) which ultimately is used as 16-bit unsigned by typecasting to u16. Fixed by using an unsigned 32-bit integer since the value is truncated anyway in the end. Fixes: 414fd46e7762 ("fsl/fman: Add FMan support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix dereference null return value [ Upstream commit 0572054617f32670abab4b4e89a876954d54b704 ] Check before using returned value to avoid dereferencing null pointer. Fixes: 18a6c85fcc78 ("fsl/fman: Add FMan Port Support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix unreachable code [ Upstream commit cc79fd8f557767de90ff199d3b6fb911df43160a ] The parameter 'priority' is incorrectly forced to zero which ultimately induces logically dead code in the subsequent lines. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: check dereferencing null pointer [ Upstream commit cc5d229a122106733a85c279d89d7703f21e4d4f ] Add a safe check to avoid dereferencing null pointer Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> fsl/fman: fix eth hash table allocation [ Upstream commit 3207f715c34317d08e798e11a10ce816feb53c0f ] Fix memory allocation for ethernet address hash table. The code was wrongly allocating an array for eth hash table which is incorrect because this is the main structure for eth hash table (struct eth_hash_t) that contains inside a number of elements. Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support") Signed-off-by: Florinel Iordache <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]> dlm: Fix kobject memleak [ Upstream commit 0ffddafc3a3970ef7013696e7f36b3d378bc4c16 ] Currently the error return path from kobject_init_and_add() is not followed by a call to kobject_put() - which means we are leaking the kobject. Set do_unreg = 1 before kobject_init_and_add() to ensure that kobject_put() can be called in its error patch. Fixes: 901195ed7f4b ("Kobject: change GFS2 to use kobject_init_and_add") Reported-by: Hulk Robot <[email protected]> Signed-off-by: Wang Hai <[email protected]> Signed-off-by: David Teigland <[email protected]> Signed-off-by: Sasha Levin <[email protected]> pinctrl-single: fix pcs_parse_pinconf() return value [ Upstream commit f46fe79ff1b65692a65266a5bec6dbe2bf7fc70f ] This patch causes pcs_parse_pinconf() to return -ENOTSUPP when no pinctrl_map is added. The current behavior is to return 0 when !PCS_HAS_PINCONF or !nconfs. Thus pcs_parse_one_pinctrl_entry() incorrectly assumes that a map was added and sets num_maps = 2. Analysis: ========= The function pcs_parse_one_pinctrl_entry() calls pcs_parse_pinconf() if PCS_HAS_PINCONF is enabled. The function pcs_parse_pinconf() returns 0 to indicate there was no error and num_maps is then set to 2: 980 static int pcs_parse_one_pinctrl_entry(struct pcs_device *pcs, 981 struct device_node *np, 982 struct pinctrl_map **map, 983 unsigned *num_maps, 984 const char **pgnames) 985 { <snip> 1053 (*map)->type = PIN_MAP_TYPE_MUX_GROUP; 1054 (*map)->data.mux.group = np->name; 1055 (*map)->data.mux.function = np->name; 1056 1057 if (PCS_HAS_PINCONF && function) { 1058 res = pcs_parse_pinconf(pcs, np, function, map); 1059 if (res) 1060 goto free_pingroups; 1061 *num_maps = 2; 1062 } else { 1063 *num_maps = 1; 1064 } However, pcs_parse_pinconf() will also return 0 if !PCS_HAS_PINCONF or !nconfs. I believe these conditions should indicate that no map was added by returning -ENOTSUPP. Otherwise pcs_parse_one_pinctrl_entry() will set num_maps = 2 even though no maps were successfully added, as it does not reach "m++" on line 940: 895 static int pcs_parse_pinconf(struct pcs_device *pcs, struct device_node *np, 896 struct pcs_function *func, 897 struct pinctrl_map **map) 898 899 { 900 struct pinctrl_map *m = *map; <snip> 917 /* If pinconf isn't supported, don't parse properties in below. */ 918 if (!PCS_HAS_PINCONF) 919 return 0; 920 921 /* cacluate how much properties are supported in current node */ 922 for (i = 0; i < ARRAY_SIZE(prop2); i++) { 923 if (of_find_property(np, prop2[i].name, NULL)) 924 nconfs++; 925 } 926 for (i = 0; i < ARRAY_SIZE(prop4); i++) { 927 if (of_find_property(np, prop4[i].name, NULL)) 928 nconfs++; 929 } 930 if (!nconfs) 919 return 0; 932 933 func->conf = devm_kcalloc(pcs->dev, 934 nconfs, sizeof(struct pcs_conf_vals), 935 GFP_KERNEL); 936 if (!func->conf) 937 return -ENOMEM; 938 func->nconfs = nconfs; 939 conf = &(func->conf[0]); 940 m++; This situtation will cause a boot failure [0] on the BeagleBone Black (AM3358) when am33xx_pinmux node in arch/arm/boot/dts/am33xx-l4.dtsi has compatible = "pinconf-single" instead of "pinctrl-single". The patch fixes this issue by returning -ENOSUPP when !PCS_HAS_PINCONF or !nconfs, so that pcs_parse_one_pinctrl_entry() will know that no map was added. Logic is also added to pcs_parse_one_pinctrl_entry() to distinguish between -ENOSUPP and other errors. In the case of -ENOSUPP, num_maps is set to 1 as it is valid for pinconf to be enabled and a given pin group to not any pinconf properties. [0] https://lore.kernel.org/linux-omap/20200529175544.GA3766151@x1/ Fixes: 9dddb4df90d1 ("pinctrl: single: support generic pinconf") Signed-off-by: Drew Fustini <[email protected]> Acked-by: Tony Lindgren <[email protected]> Link: https://lore.kernel.org/r/20200608125143.GA2789203@x1 Signed-off-by: Linus Walleij <[email protected]> Signed-off-by: Sasha Levin <[email protected]> x86/fsgsbase/64: Fix NULL deref in 86_fsgsbase_read_task [ Upstream commit 8ab49526b53d3172d1d8dd03a75c7d1f5bd21239 ] syzbot found its way in 86_fsgsbase_read_task() and triggered this oops: KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] CPU: 0 PID: 6866 Comm: syz-executor262 Not tainted 5.8.0-syzkaller #0 RIP: 0010:x86_fsgsbase_read_task+0x16d/0x310 arch/x86/kernel/process_64.c:393 Call Trace: putreg32+0x3ab/0x530 arch/x86/kernel/ptrace.c:876 genregs32_set arch/x86/kernel/ptrace.c:1026 [inline] genregs32_set+0xa4/0x100 arch/x86/kernel/ptrace.c:1006 copy_regset_from_user include/linux/regset.h:326 [inline] ia32_arch_ptrace arch/x86/kernel/ptrace.c:1061 [inline] compat_arch_ptrace+0x36c/0xd90 arch/x86/kernel/ptrace.c:1198 __do_compat_sys_ptrace kernel/ptrace.c:1420 [inline] __se_compat_sys_ptrace kernel/ptrace.c:1389 [inline] __ia32_compat_sys_ptrace+0x220/0x2f0 kernel/ptrace.c:1389 do_syscall_32_irqs_on arch/x86/entry/common.c:84 [inline] __do_fast_syscall_32+0x57/0x80 arch/x86/entry/common.c:126 do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:149 entry_SYSENTER_compat_after_hwframe+0x4d/0x5c This can happen if ptrace() or sigreturn() pokes an LDT selector into FS or GS for a task with no LDT and something tries to read the base before a return to usermode notices the bad selector and fixes it. The fix is to make sure ldt pointer is not NULL. Fixes: 07e1d88adaae ("x86/fsgsbase/64: Fix ptrace() to read the FS/GS base accurately") Co-developed-by: Jann Horn <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Reported-by: syzbot <[email protected]> Acked-by: Andy Lutomirski <[email protected]> Cc: Chang S. Bae <[email protected]> Cc: Andy Lutomirski <[email protected]> Cc: Borislav Petkov <[email protected]> Cc: Brian Gerst <[email protected]> Cc: Dave Hansen <[email protected]> Cc: Denys Vlasenko <[email protected]> Cc: H. Peter Anvin <[email protected]> Cc: Markus T Metzger <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Shankar <[email protected]> Cc: Rik van Riel <[email protected]> Cc: Thomas Gleixner <[email protected]> Cc: Ingo Molnar <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Sasha Levin <[email protected]> crypto: aesni - add compatibility with IAS [ Upstream commit 44069737ac9625a0f02f0f7f5ab96aae4cd819bc ] Clang's integrated assembler complains "invalid reassignment of non-absolute variable 'var_ddq_add'" while assembling arch/x86/crypto/aes_ctrby8_avx-x86_64.S. It was because var_ddq_add was reassigned with non-absolute values several times, which IAS did not support. We can avoid the reassignment by replacing the uses of var_ddq_add with its definitions accordingly to have compatilibility with IAS. Link: https://github.com/ClangBuiltLinux/linux/issues/1008 Reported-by: Sedat Dilek <[email protected]> Reported-by: Fangrui Song <[email protected]> Tested-by: Sedat Dilek <[email protected]> # build+boot Linux v5.7.5; clang v11.0.0-git Signed-off-by: Jian Cai <[email protected]> Signed-off-by: Herbert Xu <[email protected]> Signed-off-by: Sasha Levin <[email protected]> af_packet: TPACKET_V3: fix fill status rwlock imbalance [ Upstream commit 88fd1cb80daa20af063bce81e1fad14e945a8dc4 ] After @blk_fill_in_prog_lock is acquired there is an early out vnet situation that can occur. In that case, the rwlock needs to be released. Also, since @blk_fill_in_prog_lock is only acquired when @tp_version is exactly TPACKET_V3, only release it on that exact condition as well. And finally, add sparse annotation so that it is clearer that prb_fill_curr_block() and prb_clear_blk_fill_status() are acquiring and releasing @blk_fill_in_prog_lock, respectively. sparse is still unable to understand the balance, but the warnings are now on a higher level that make more sense. Fixes: 632ca50f2cbd ("af_packet: TPACKET_V3: replace busy-wait loop") Signed-off-by: John Ogness <[email protected]> Reported-by: kernel test robot <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> drivers/net/wan/lapbether: Added needed_headroom and a skb->len check [ Upstream commit c7ca03c216acb14466a713fedf1b9f2c24994ef2 ] 1. Added a skb->len check This driver expects upper layers to include a pseudo header of 1 byte when passing down a skb for transmission. This driver will read this 1-byte header. This patch added a skb->len check before reading the header to make sure the header exists. 2. Changed to use needed_headroom instead of hard_header_len to request necessary headroom to be allocated In net/packet/af_packet.c, the function packet_snd first reserves a headroom of length (dev->hard_header_len + dev->needed_headroom). Then if the socket is a SOCK_DGRAM socket, it calls dev_hard_header, which calls dev->header_ops->create, to create the link layer header. If the socket is a SOCK_RAW socket, it "un-reserves" a headroom of length (dev->hard_header_len), and assumes the user to provide the appropriate link layer header. So according to the logic of af_packet.c, dev->hard_header_len should be the length of the header that would be created by dev->header_ops->create. However, this driver doesn't provide dev->header_ops, so logically dev->hard_header_len should be 0. So we should use dev->needed_headroom instead of dev->hard_header_len to request necessary headroom to be allocated. This change fixes kernel panic when this driver is used with AF_PACKET SOCK_RAW sockets. Call stack when panic: [ 168.399197] skbuff: skb_under_panic: text:ffffffff819d95fb len:20 put:14 head:ffff8882704c0a00 data:ffff8882704c09fd tail:0x11 end:0xc0 dev:veth0 ... [ 168.399255] Call Trace: [ 168.399259] skb_push.cold+0x14/0x24 [ 168.399262] eth_header+0x2b/0xc0 [ 168.399267] lapbeth_data_transmit+0x9a/0xb0 [lapbether] [ 168.399275] lapb_data_transmit+0x22/0x2c [lapb] [ 168.399277] lapb_transmit_buffer+0x71/0xb0 [lapb] [ 168.399279] lapb_kick+0xe3/0x1c0 [lapb] [ 168.399281] lapb_data_request+0x76/0xc0 [lapb] [ 168.399283] lapbeth_xmit+0x56/0x90 [lapbether] [ 168.399286] dev_hard_start_xmit+0x91/0x1f0 [ 168.399289] ? irq_init_percpu_irqstack+0xc0/0x100 [ 168.399291] __dev_queue_xmit+0x721/0x8e0 [ 168.399295] ? packet_parse_headers.isra.0+0xd2/0x110 [ 168.399297] dev_queue_xmit+0x10/0x20 [ 168.399298] packet_sendmsg+0xbf0/0x19b0 ...... Cc: Willem de Bruijn <[email protected]> Cc: Martin Schiller <[email protected]> Cc: Brian Norris <[email protected]> Signed-off-by: Xie He <[email protected]> Acked-by: Willem de Bruijn <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net/nfc/rawsock.c: add CAP_NET_RAW check. [ Upstream commit 26896f01467a28651f7a536143fe5ac8449d4041 ] When creating a raw AF_NFC socket, CAP_NET_RAW needs to be checked first. Signed-off-by: Qingyu Li <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: refactor bind_bucket fastreuse into helper [ Upstream commit 62ffc589abb176821662efc4525ee4ac0b9c3894 ] Refactor the fastreuse update code in inet_csk_get_port into a small helper function that can be called from other places. Acked-by: Matthieu Baerts <[email protected]> Signed-off-by: Tim Froidcoeur <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> net: Set fput_needed iff FDPUT_FPUT is set [ Upstream commit ce787a5a074a86f76f5d3fd804fa78e01bfb9e89 ] We should fput() file iff FDPUT_FPUT is set. So we should set fput_needed accordingly. Fixes: 00e188ef6a7e ("sockfd_lookup_light(): switch to fdget^W^Waway from fget_light") Signed-off-by: Miaohe Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: serial: cp210x: re-enable auto-RTS on open commit c7614ff9b73a1e6fb2b1b51396da132ed22fecdb upstream. CP210x hardware disables auto-RTS but leaves auto-CTS when in hardware flow control mode and UART on cp210x hardware is disabled. When re-opening the port, if auto-CTS is enabled on the cp210x, then auto-RTS must be re-enabled in the driver. Signed-off-by: Brant Merryman <[email protected]> Co-developed-by: Phu Luu <[email protected]> Signed-off-by: Phu Luu <[email protected]> Link: https://lore.kernel.org/r/[email protected] [ johan: fix up tags and problem description ] Fixes: 39a66b8d22a3 ("[PATCH] USB: CP2101 Add support for flow control") Cc: stable <[email protected]> # 2.6.12 Signed-off-by: Johan Hovold <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]> USB: serial: cp210x: enable usb generic throttle/unthrottle commit 4387b3dbb079d482d3c2b43a703ceed4dd27ed28 upstream. Assign the .throttle and .unthrottle functions to be generic function in the driver structure to prevent d…
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: John Vincent <[email protected]> Signed-off-by: John Vincent <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f ] YangYuxi is reporting that connection reuse is causing one-second delay when SYN hits existing connection in TIME_WAIT state. Such delay was added to give time to expire both the IPVS connection and the corresponding conntrack. This was considered a rare case at that time but it is causing problem for some environments such as Kubernetes. As nf_conntrack_tcp_packet() can decide to release the conntrack in TIME_WAIT state and to replace it with a fresh NEW conntrack, we can use this to allow rescheduling just by tuning our check: if the conntrack is confirmed we can not schedule it to different real server and the one-second delay still applies but if new conntrack was created, we are free to select new real server without any delays. YangYuxi lists some of the problem reports: - One second connection delay in masquerading mode: https://marc.info/?t=151683118100004&r=1&w=2 - IPVS low throughput #70747 kubernetes/kubernetes#70747 - Apache Bench can fill up ipvs service proxy in seconds #544 cloudnativelabs/kube-router#544 - Additional 1s latency in `host -> service IP -> pod` kubernetes/kubernetes#90854 Fixes: f719e37 ("ipvs: drop first packet to redirect conntrack") Co-developed-by: YangYuxi <[email protected]> Signed-off-by: YangYuxi <[email protected]> Signed-off-by: Julian Anastasov <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Pablo Neira Ayuso <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Uh oh!
There was an error while loading. Please reload this page.
I am not sure if I have something configured wrong but here is my Centos7 physical node and kube-router agent setup:
[ipvsadm package]
$ rpm -q ipvsadm
ipvsadm-1.27-7.el7.x86_64
[kube router process and options]
$ ps -ocommand= -C kube-router
/usr/local/bin/kube-router --run-router=true --run-firewall=true --run-service-proxy=true --kubeconfig=/etc/kubernetes/kube-router.kubeconfig --hostname-override=node6 --enable-overlay=true
[service]
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
test-svc NodePort 172.30.176.114 80:30530/TCP 7h
[ipvs]
$ ipvsadm -ln | head -n 1
IP Virtual Server version 1.2.1 (size=4096)
[ipvs service]
$ ipvsadm -ln | grep -A1 30530
TCP 10.200.1.146:30530 rr
-> 172.32.9.68:80 Masq 1 0 0
If I use apache bench with tcp keepalive all is swell and absurdly fast, posting over 10,000 requests per second and ipvsadm will show stats like below during such a test:
$ ipvsadm -ln | grep -A1 30530
TCP 10.200.1.146:30530 rr
-> 172.32.9.68:80 Masq 1 0 757
However if I run the same test without keep-alive then "InActConn" jumps up to 14000 within a few seconds and up until that point things are very fast, but after that point the virtual server just completely hangs up and stops responding to requests until "InActConn" drops back below 14000. This happens if I run apache bench on the node itself and hit the clusterIp, or if I run it from a random server and hit the nodeport.
---ipvs
$ ipvsadm -ln | grep -A1 30530
TCP 10.200.1.146:30530 rr
-> 172.32.9.68:80 Masq 1 0 14115
--- apache bench output
ab -c 100 -n 20000 http://node6:30530
This is ApacheBench, Version 2.3 <$Revision: 1826891 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking node6 (be patient)
Completed 2000 requests
Completed 4000 requests
Completed 6000 requests
Completed 8000 requests
Completed 10000 requests
Completed 12000 requests
Completed 14000 requests
Completed 16000 requests
Completed 18000 requests
Completed 20000 requests
Finished 20000 requests
Server Software: Apache/2.4.34
Server Hostname: node6
Server Port: 30530
Document Path: /
Document Length: 2512 bytes
Concurrency Level: 100
Time taken for tests: 63.914 seconds
Complete requests: 20000
Failed requests: 0
Total transferred: 55860000 bytes
HTML transferred: 50240000 bytes
Requests per second: 312.92 [#/sec] (mean)
Time per request: 319.569 [ms] (mean)
Time per request: 3.196 [ms] (mean, across all concurrent requests)
Transfer rate: 853.50 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 1 311 456.6 10 1005
Processing: 1 7 4.5 8 36
Waiting: 0 7 4.5 8 36
Total: 2 318 453.1 19 1010
Percentage of the requests served within a certain time (ms)
50% 19
66% 21
75% 1003
80% 1004
90% 1004
95% 1005
98% 1005
99% 1006
100% 1010 (longest request)
The text was updated successfully, but these errors were encountered: