Wifi drop in a regular way #1166

Closed
poofyteddy opened this issue Sep 9, 2020 · 44 comments
Labels
bug stale This issue will be closed soon because of prolonged inactivity

Comments

@poofyteddy

Describe the bug
The WiFi drops for about 20 min at a time, roughly every 40 min, for no visible reason.
I've seen other issues, so this is what I have tried:
tested with and without the HA integration
tested with and without LEDs connected
tested with and without 'keep wifi from sleeping'
tested without doing anything to it
tested with all other protocols disabled (MQTT, UDP sync and so on)
tested with only NTP disabled

The ESP doesn't reboot, because the LED effects keep running during both the drop and the reconnection.

To Reproduce
Flash a nodemcu esp8266 lolin with Wled official image, power on, enjoy ?

Expected behavior
No drop please :(

WLED version

  • Board: nodemcu esp8266 lolin
  • Version: 0.10.0 and 0.10.2
  • Format: Binary

Additional context
My wifi is based on unifi, on channel 1 because Nanoleaf was picky about it.
49665 packets transmitted, 31224 received, +17839 errors, 37.1308% packet loss, time 50154891ms
I am willing to try anything that can help you debug

Thank you for your help!
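
As a rough cross-check of the ping summary quoted above, here is a minimal Python sketch that parses that one line and recomputes the loss. It assumes the line comes from a Linux (iputils-style) `ping`; the helper name `parse_ping_summary` is made up for illustration.

```python
import re

# The summary line exactly as posted in the issue.
PING_SUMMARY = (
    "49665 packets transmitted, 31224 received, +17839 errors, "
    "37.1308% packet loss, time 50154891ms"
)


def parse_ping_summary(line: str) -> dict:
    """Extract the counters from a ping summary line and derive the loss."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received.*?time (\d+)ms", line
    )
    if match is None:
        raise ValueError("not a ping summary line")
    sent, received, elapsed_ms = (int(group) for group in match.groups())
    return {
        "sent": sent,
        "received": received,
        "lost": sent - received,
        "loss_percent": 100.0 * (sent - received) / sent,
        "hours": elapsed_ms / 3_600_000,  # milliseconds to hours
    }


stats = parse_ping_summary(PING_SUMMARY)
print(
    f"lost {stats['lost']} of {stats['sent']} packets "
    f"({stats['loss_percent']:.4f}%) over {stats['hours']:.1f} h"
)
```

The recomputed percentage matches the figure ping reported, and the elapsed time works out to roughly 14 hours of continuous pinging.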

@poofyteddy poofyteddy added the bug label Sep 9, 2020
@huggy-d1
Contributor

huggy-d1 commented Sep 9, 2020

The NodeMCU is not in a box, correct? No PIR sensor nearby? Try disabling things like Hue, Blynk, Alexa and such until you find a stable wifi connection.
I would drop back to WLED v0.8.6 and install that to rule out environmental issues. My understanding is that version seemed to work pretty well when it comes to wifi stability.

@pbolduc
Contributor

pbolduc commented Sep 9, 2020

My wifi is based on unifi, on channel 1 because Nanoleaf was picky about it.

Does your UniFi Controller provide any additional information? I don't have time to log in to mine right now, but it should show client drops.

@Aircoookie
Member

Please check the Info page in the main UI for your signal strength. Anything below 60% tends to disappear randomly. Signal is a bit unreliable with ESPs...

@pbolduc
Contributor

pbolduc commented Sep 9, 2020

This could be related to esp8266/Arduino#5083 (comment). The recommendation there is to use WiFi.mode(WIFI_NONE_SLEEP); however, digging into the source, it seems it should be WiFi.setSleepMode(WIFI_NONE_SLEEP). WiFi.mode accepts a WiFiMode_t, which has the values WIFI_OFF = 0, WIFI_STA = 1, WIFI_AP = 2, WIFI_AP_STA = 3, whereas setSleepMode takes a WiFiSleepType_t, which has the values WIFI_NONE_SLEEP = 0, WIFI_LIGHT_SLEEP = 1, WIFI_MODEM_SLEEP = 2. From the docs,

   call order:
     wifi_set_sleep_level(MAX_SLEEP_T) (SDK3)
     wifi_set_listen_interval          (SDK3)
     wifi_set_sleep_type               (all SDKs)

from .platformio\packages\framework-arduinoespressif8266\libraries\ESP8266WiFi\src\ESP8266WiFiGeneric.cpp

   /**
    * datasheet:
    *
   wifi_set_sleep_level():
   Set sleep level of modem sleep and light sleep
   This configuration should be called before calling wifi_set_sleep_type
   Modem-sleep and light sleep mode have minimum and maximum sleep levels.
   - In minimum sleep level, station wakes up at every DTIM to receive
     beacon.  Broadcast data will not be lost because it is transmitted after
     DTIM.  However, it can not save much more power if DTIM period is short,
     as specified in AP.
   - In maximum sleep level, station wakes up at every listen interval to
     receive beacon.  Broadcast data may be lost because station may be in sleep
     state at DTIM time.  If listen interval is longer, more power will be saved, but
     it’s very likely to lose more broadcast data.
   - Default setting is minimum sleep level.
   Further reading: https://routerguide.net/dtim-interval-period-best-setting/

   wifi_set_listen_interval():
   Set listen interval of maximum sleep level for modem sleep and light sleep
   It only works when sleep level is set as MAX_SLEEP_T
   forum: https://github.com/espressif/ESP8266_NONOS_SDK/issues/165#issuecomment-416121920
   default value seems to be 3 (as recommended by https://routerguide.net/dtim-interval-period-best-setting/)

   call order:
     wifi_set_sleep_level(MAX_SLEEP_T) (SDK3)
     wifi_set_listen_interval          (SDK3)
     wifi_set_sleep_type               (all SDKs)

    */
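
To illustrate the distinction drawn above, here is a minimal Python model of the two enums, using only the values quoted in this comment. It is not ESP8266 code: `interpret_as_mode` is a hypothetical helper that shows how the number behind WIFI_NONE_SLEEP would read if it were treated as a WiFiMode_t.

```python
from enum import IntEnum


class WiFiMode(IntEnum):
    """Values of WiFiMode_t as listed in the comment above."""
    WIFI_OFF = 0
    WIFI_STA = 1
    WIFI_AP = 2
    WIFI_AP_STA = 3


class WiFiSleepType(IntEnum):
    """Values of WiFiSleepType_t as listed in the comment above."""
    WIFI_NONE_SLEEP = 0
    WIFI_LIGHT_SLEEP = 1
    WIFI_MODEM_SLEEP = 2


def interpret_as_mode(raw_value: int) -> WiFiMode:
    """Hypothetical helper: read a raw integer as if it were a WiFiMode_t."""
    return WiFiMode(int(raw_value))


# The sleep constant and the "radio off" mode share the same number,
# so the two settings must go through their own setters.
misread = interpret_as_mode(WiFiSleepType.WIFI_NONE_SLEEP)
print(misread.name)
```

Under these listed values, the sleep constant would be read as WIFI_OFF if it reached the mode setter, which is consistent with the suggestion to call setSleepMode instead.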

@pbolduc
Contributor

pbolduc commented Sep 9, 2020

Is this a duplicate of #424 ?

@poofyteddy
Author

poofyteddy commented Sep 9, 2020

@huggy-d1 It's in a vented plastic box (3D printed) with no PIR, and it showed the same behaviour when connected via USB only, so power-supply noise is ruled out. I have disabled everything in sync (WLED broadcast, UDP realtime, DMX, Alexa, Blynk, MQTT, Hue).
I will try downgrading to v0.8.6, but I have to back up the presets I made with my wife beforehand or I am going to get slayed 😅

@Aircoookie Signal strength is at 44%, but it has also happened on my desk, 1 m away from the AP. Still, before downgrading I'll try it with a signal above 90%.

@pbolduc As far as I know, my UniFi doesn't log client drops, but I can't see the device in the device list and can't ping it when it's gone.

We're not in the same timezone; it's close to midnight here, so I'll try all of that tomorrow. Thanks!

I had seen #424 but didn't read it fully because it looked like people needed a hard reset to get WiFi back, which isn't my case.
Reading it now sadly didn't bring me anything new :(

@huggy-d1
Contributor

I look forward to hearing how it goes after your backup (is there an HTTP or JSON API request returning all current settings?), full erase, v0.8.6 install and reconfigure.

@poofyteddy
Author

poofyteddy commented Sep 10, 2020

is there an HTTP or JSON API request returning all current settings?

This was announced for 10.0.0, but I can't see it.
Sadly, it looks like I can't set my UniFi WiFi mesh back up (the option is missing or has moved) to increase the WiFi coverage. A PoE switch arrives tomorrow, which will get rid of that issue.

I'll keep you posted.

@pbolduc
Contributor

pbolduc commented Sep 10, 2020

I tried last night to reproduce the issue. I also have a UniFi system here. I could not reproduce the drop. Perhaps that is because I was hitting the /json/info endpoint via Node-RED to see if it dropped. Also, my instance is very close to an access point in my office. Here are my Grafana graphs. The uptime drops are where I manually rebooted. I originally had MQTT enabled. I could try to reproduce by not making remote requests for a longer period.

image

@poofyteddy
Author

Thank you for trying, @pbolduc. As additional info, the drop happens whether I do something with the UI or not, and more than once I was "kicked out" by the drop while playing with the lights, so clearly it didn't go to sleep because of inactivity.

@pbolduc
Contributor

pbolduc commented Sep 10, 2020

I set up my monitoring really quickly using Docker. I used this repo (requires a Raspberry Pi, WSL2 or another Linux environment): https://github.com/gcgarner/IOTstack.git

  • install node-red
  • install grafana
  • install influxdb

then create a flow like this:

image

Hit me up on the WLED discord if you need help.

You can import this flow,

[
    {
        "id": "72cd5dfa.0ddb9c",
        "type": "tab",
        "label": "Flow 1",
        "disabled": false,
        "info": ""
    },
    {
        "id": "4dc81af5.4d7bac",
        "type": "inject",
        "z": "72cd5dfa.0ddb9c",
        "d": true,
        "name": "",
        "props": [
            {
                "p": "payload"
            },
            {
                "p": "topic",
                "vt": "str"
            }
        ],
        "repeat": "5",
        "crontab": "",
        "once": false,
        "onceDelay": 0.1,
        "topic": "",
        "payload": "",
        "payloadType": "date",
        "x": 190,
        "y": 40,
        "wires": [
            [
                "11264bf.036b1b4"
            ]
        ]
    },
    {
        "id": "11264bf.036b1b4",
        "type": "http request",
        "z": "72cd5dfa.0ddb9c",
        "name": "",
        "method": "GET",
        "ret": "obj",
        "paytoqs": "ignore",
        "url": "http://10.0.9.101/json/info",
        "tls": "",
        "persist": false,
        "proxy": "",
        "authType": "",
        "x": 380,
        "y": 40,
        "wires": [
            [
                "82da9191.3726d8",
                "d44200b3.b73e4"
            ]
        ]
    },
    {
        "id": "d44200b3.b73e4",
        "type": "debug",
        "z": "72cd5dfa.0ddb9c",
        "name": "",
        "active": false,
        "tosidebar": true,
        "console": false,
        "tostatus": false,
        "complete": "true",
        "targetType": "full",
        "statusVal": "",
        "statusType": "auto",
        "x": 550,
        "y": 100,
        "wires": []
    },
    {
        "id": "fab270c4.5de9",
        "type": "influxdb out",
        "z": "72cd5dfa.0ddb9c",
        "influxdb": "1705007.08cd08",
        "name": "",
        "measurement": "wled",
        "precision": "",
        "retentionPolicy": "",
        "x": 790,
        "y": 40,
        "wires": []
    },
    {
        "id": "82da9191.3726d8",
        "type": "function",
        "z": "72cd5dfa.0ddb9c",
        "name": "",
        "func": "\nconst payload = [{\n    uptime: msg.payload.uptime,\n    signal: msg.payload.wifi.signal,\n    rssi: msg.payload.wifi.rssi\n},\n{\n    macAddress: msg.payload.mac\n}];\n\nmsg.payload = payload;\nreturn msg;\n",
        "outputs": 1,
        "noerr": 0,
        "initialize": "",
        "finalize": "",
        "x": 560,
        "y": 40,
        "wires": [
            [
                "fab270c4.5de9"
            ]
        ]
    },
    {
        "id": "1705007.08cd08",
        "type": "influxdb",
        "z": "",
        "hostname": "influxdb",
        "port": "8086",
        "protocol": "http",
        "database": "wled",
        "name": "",
        "usetls": false,
        "tls": ""
    }
]
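
For readers without Node-RED, the same polling idea can be sketched in plain Python. This is only an approximation of the flow above, under a few assumptions: the device address is the one from the flow, the 5 second interval mirrors the inject node, and the field names (`uptime`, `wifi.signal`, `wifi.rssi`, `mac`) are the ones the function node reads. The function names are made up, and writing to InfluxDB is left out; the loop just prints.

```python
import json
import time
import urllib.request

# Address taken from the flow above; replace it with your own device.
WLED_INFO_URL = "http://10.0.9.101/json/info"


def to_influx_payload(info: dict) -> list:
    """Mirror the flow's function node: [fields, tags] from a /json/info reply."""
    fields = {
        "uptime": info["uptime"],
        "signal": info["wifi"]["signal"],
        "rssi": info["wifi"]["rssi"],
    }
    tags = {"macAddress": info["mac"]}
    return [fields, tags]


def fetch_info(url: str = WLED_INFO_URL, timeout: float = 3.0) -> dict:
    """GET the info endpoint and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def poll_forever(interval_s: float = 5.0) -> None:
    """Print one sample per interval; failed requests are logged, not fatal."""
    while True:
        stamp = time.strftime("%H:%M:%S")
        try:
            print(stamp, to_influx_payload(fetch_info()))
        except (OSError, ValueError) as exc:  # network error or bad JSON
            print(stamp, "request failed:", exc)
        time.sleep(interval_s)


# Call poll_forever() to start polling; it is not started automatically here.
```

A run of "request failed" lines with timestamps gives the start and end of each drop without needing Grafana.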

@poofyteddy
Author

I already have a Grafana setup, but only with Carbon as a backend. I will install InfluxDB soon, but not right now.

I have backed up everything and plugged the device in much closer to my AP: 100%, -39 dBm.
We will see. If it drops, I will try polling it with Node-RED like pbolduc did, just to see, and if it drops again I will downgrade :)

@poofyteddy
Author

No success with polling it the way @pbolduc did.
Screenshot from 2020-09-11 14-33-59
I made some edits to the Node-RED flow in order to push the data to a Home Assistant sensor.

What would be the best way to fully wipe the ESP8266 before a new flash? When I flashed the 10.0.1 binary over 10.0.0, the settings weren't wiped.

@sansillusion

Hi, I had constant reboots and crashes on my esp32 but when I turned off the option "Auto brightness adjustment" in led settings all reboot/crashes went away. Might be a good idea to try it.

@poofyteddy
Author

Thank you @sansillusion, I just tried it, but without success.

I'm not surprised, because (and I insist on this) it does not crash; only the network stops working.

@pbolduc
Contributor

pbolduc commented Sep 11, 2020

Do you have an extra microcontroller? I'm wondering whether a debug build, with the serial console being monitored, can help shed light on what is happening.

@poofyteddy
Author

i have a second one on hand, but i will have to read the doc about building a .bin

@pbolduc
Contributor

pbolduc commented Sep 11, 2020

i have a second one on hand, but i will have to read the doc about building a .bin

Check out Digiblur's video on using VS Code + Platform IO to compile Tasmota. It is the same for WLED

https://www.youtube.com/watch?v=Sz2zc_0PdiY

@toto79
Contributor

toto79 commented Sep 11, 2020

You can give https://gitpod.io#https://github.com/Aircoookie/WLED/tree/master a try.

Run first: pip3 install -U platformio
Run second: platformio run -e env:type, e.g. platformio run -e d1_mini or platformio run -e esp01_1m_full (as defined in platformio.ini)

Find your firmware.bin file in the folder .pio/build/env:type, save as... and flash it to your ESP.
You'll also find a good summary at https://tasmota.github.io/docs/Gitpod/

@poofyteddy
Author

I have set up PlatformIO in VS Code, but I can't find the build parameters used for the .bin files in the GitHub release.
What is in platformio.ini doesn't match what is in https://github.com/Aircoookie/WLED/wiki/Install-WLED-binary#what-binary-should-i-use

Building something that isn't identical to the release will not help :(
Did I miss a file or a piece of documentation?
The closest I could find was

[env:travis_esp8266]
extends = env:d1_mini
build_type = debug
build_flags = ${common.build_flags_esp8266} ${common.debug_flags} ${common.build_flags_all_features}

@pbolduc
Contributor

pbolduc commented Sep 11, 2020

If you open the project with Platform IO, you should be able to see the Platform IO tasks window, click on the build and then upload buttons

image

@poofyteddy
Author

poofyteddy commented Sep 11, 2020

But is d1_mini the one uploaded as WLED_0.x.x_ESP8266.bin on GitHub?
Building it works; I'll just have to override it with build_type = debug if it's the right one.
EDIT: or not, since env:d1_mini_debug already has everything.

@pbolduc
Contributor

pbolduc commented Sep 11, 2020

These are all the debug flags in platformio.ini:

debug_flags = -D DEBUG=1 -D WLED_DEBUG -DDEBUG_ESP_WIFI -DDEBUG_ESP_HTTP_CLIENT -DDEBUG_ESP_HTTP_UPDATE -DDEBUG_ESP_HTTP_SERVER -DDEBUG_ESP_UPDATER -DDEBUG_ESP_OTA -DDEBUG_TLS_MEM

you could start by trying d1_mini_debug but that enables all those flags above. You may be flooded with output but sometimes it is better to have too much output than too little.

You can create your own config (you can put it in a new file platformio_override.ini to avoid making changes to the checked in one)

[env:d1_mini_poofteddy]
extends = env:d1_mini
build_type = debug
build_flags = ${common.build_flags_esp8266} -DDEBUG_ESP_WIFI 

This configuration only debugs the ESP_WIFI stuff

@poofyteddy
Author

I think the default debug build is too much, because the device is rebooting a lot. I'll try with only -DDEBUG_ESP_WIFI :) Thanks

@poofyteddy
Author

It just dropped and came back, with nothing on the serial terminal. Maybe I need to add more flags.

@pbolduc
Contributor

pbolduc commented Sep 12, 2020

Be sure to have WLED_DEBUG defined too, otherwise the serial statements do nothing: https://github.com/Aircoookie/WLED/blob/2716f4cbe9197a8c45539efa3e359c3d3deea75b/wled00/wled.h#L127-L138

And then I would expect some of these https://github.com/Aircoookie/WLED/blob/2716f4cbe9197a8c45539efa3e359c3d3deea75b/wled00/wled.cpp#L463-L487 statements to be printed.

@poofyteddy
Author

Done,

---DEBUG INFO---
Runtime: 40009
Unix time: 40
Free heap: 26888
Wifi state: 3
State time: 0
NTP last sync: 999000000
Client IP: 192.168.90.177
Loops/sec: 7795

i'm waiting for it to drop :)

@poofyteddy
Author

It dropped 15 min ago, and I still get

---DEBUG INFO---
Runtime: 2910884
Unix time: 2910
Free heap: 26720
Wifi state: 3
State time: 0
NTP last sync: 999000000
Client IP: 192.168.90.177
Loops/sec: 7857

Sooo, the WiFi stack works, but not the IP stack?
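
A small Python sketch can make the reading of that debug block explicit. It assumes that the "Wifi state" line prints the Arduino `WiFi.status()` value, where 3 is `WL_CONNECTED`, and that "Runtime" is in milliseconds (the first block shows Runtime 40009 next to Unix time 40, which suggests so). The parser name is made up for illustration.

```python
# The debug block as posted above, captured during a drop.
DEBUG_BLOCK = """\
---DEBUG INFO---
Runtime: 2910884
Unix time: 2910
Free heap: 26720
Wifi state: 3
State time: 0
NTP last sync: 999000000
Client IP: 192.168.90.177
Loops/sec: 7857
"""

WL_CONNECTED = 3  # Arduino wl_status_t value meaning "connected to an AP"


def parse_debug_block(text: str) -> dict:
    """Turn 'Key: value' lines into a dict; lines without ': ' are skipped."""
    values = {}
    for line in text.splitlines():
        key, separator, value = line.partition(": ")
        if separator:
            values[key.strip()] = value.strip()
    return values


info = parse_debug_block(DEBUG_BLOCK)
link_up = int(info["Wifi state"]) == WL_CONNECTED
runtime_min = int(info["Runtime"]) / 60_000  # assumed milliseconds to minutes

print(f"link reported up: {link_up}, IP: {info['Client IP']}, "
      f"uptime: {runtime_min:.1f} min, free heap: {info['Free heap']}")
```

Under those assumptions the device still reports an associated link, a valid IP and stable free heap while it is unreachable, which points away from a crash or a lost association on the ESP side.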

@poofyteddy
Author

I have enabled all the debug flags from pbolduc's post, but I still get nothing more when the IP stops replying...
image
It's not the same controller as the first one, so a hardware issue is ruled out.
I will have to find a way to try with my phone's AP... just to make sure it's not the WiFi network.

@pbolduc
Contributor

pbolduc commented Sep 12, 2020

This is frustrating. It is clear that the WiFi component thinks it is connected (state = 3). Does your UniFi controller say it is connected? Do you have this running on a different VLAN with firewall rules?

@poofyteddy
Author

It is on another VLAN with firewall rules, but there shouldn't be any rule between me and the device, only between the device and the internet.
Nothing changes on the UniFi controller web page when the IP drops...
image
That's why I will try with the phone AP.

@poofyteddy
Author

So after a lot of network testing this Sunday, I found that having this option enabled seems to cause the behavior.
image
This makes no sense to me... Why would not being able to broadcast (which should be limited on a WiFi network) cause it to no longer respond 🤔

@poofyteddy
Author

Can someone with a UniFi WiFi AP try enabling this and tell me if it breaks the same way?
I have a lot of IoT stuff on this network that broadcasts trash for no reason... bringing the bandwidth down :(
I would like to keep it enabled

@pbolduc
Contributor

pbolduc commented Sep 15, 2020

Can someone with a UniFi WiFi AP try enabling this and tell me if it breaks the same way?
I have a lot of IoT stuff on this network that broadcasts trash for no reason... bringing the bandwidth down :(
I would like to keep it enabled

Earliest I might be able to test is in about 6 hours. I recommend you ask on discord as more people are watching there than this issue.

@toto79
Contributor

toto79 commented Sep 15, 2020

I'm testing it with 2 instances of WLED (1x 0.10.2 & 1x 0.10.0), and after 15 min I see no issues. Everything works fine; I will keep the test running for the next 8 hours.

@poofyteddy
Author

poofyteddy commented Sep 15, 2020

Earliest I might be able to test is in about 6 hours

A day more or less doesn't matter to me, you know :) Thanks a lot
I don't have much luck on Discord... I tend to have issues with live chat.
Thank you @toto79, let me know if it breaks at any point.
From what I have seen, it looks like enabling this option also turns on some kind of client isolation (I can't ping or scan it from something on the network), even with guest mode disabled.
I don't remember reading this in the UniFi docs.

@toto79
Contributor

toto79 commented Sep 15, 2020

Correction: connectivity between the laptop and WLED drops immediately if I enable this block broadcast checkbox. I didn't notice it at first, because during provisioning my laptop switched to another, independent WiFi (also from UniFi).
So if I connect from the LAN side it works (Laptop -> WiFi AP -> LAN -> USG -> AC-Pro -> WiFi -> WLED): no timeouts, ping OK, web interface OK, connectivity 88-100%.
If I'm within the WLED WiFi network (Laptop -> WiFi -> AC-Pro -> WiFi -> WLED): web interface timeout, ping timeout. As mentioned, the connection between the clients and UniFi is still OK, but the routing no longer works. Other devices like Tasmota are also not responding.
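
To compare the two paths described above, the same reachability probe could be run once from the LAN side and once from a client on the WLED WiFi. The following is a minimal Python sketch, assuming the device answers on TCP port 80 (the web interface); the function names and the example address are hypothetical.

```python
import socket
import time


def tcp_reachable(host: str, port: int = 80, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outages(samples):
    """Collapse (timestamp, reachable) samples into (start, end) outage intervals.

    An outage still running at the last sample is reported with end=None.
    """
    intervals = []
    down_since = None
    for timestamp, reachable in samples:
        if not reachable and down_since is None:
            down_since = timestamp
        elif reachable and down_since is not None:
            intervals.append((down_since, timestamp))
            down_since = None
    if down_since is not None:
        intervals.append((down_since, None))
    return intervals


def watch(host: str, interval_s: float = 10.0) -> None:
    """Print a timestamped line whenever reachability changes."""
    state = None
    while True:
        now_up = tcp_reachable(host)
        if now_up != state:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "UP" if now_up else "DOWN")
            state = now_up
        time.sleep(interval_s)


# Example (hypothetical address): watch("192.168.1.50")
```

If the log from the WiFi side shows DOWN while the log from the LAN side stays UP, that would match the behaviour reported here with the block broadcast option enabled.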

@poofyteddy
Author

Thanks for that, @toto79. I wonder why mine drops only 20 to 40 min after...
I also wonder when UniFi changed "client isolation" from the guest network type to this option, which is named after something it isn't...
Maybe you going through the USG has something to do with the lack of drops.

@Legsmaniac

I have UniFi (love it!) and about 7 ESP8266 devices (plus the occasional test devices) running WLED and have never experienced any problems with either WiFi or AP on any device. They can all run for several hours, running into days/weeks without a hitch.

Actually, I tell a lie: there have been a couple of occasions when I experienced drop-outs (not with WLED but with some other IoT devices), which caused me some head-scratching in the beginning, until a simple reboot of the UniFi device fixed all the problems I was having at that time. Now, if any glitches start appearing, the first thing I do is reboot UniFi and it's sorted.

@poofyteddy
Author

poofyteddy commented Sep 16, 2020

Ah! Mistakes were made XD, the wrong button was pushed.
@Legsmaniac, do you mean rebooting the UniFi software or the AP? Because I have moved and swapped the AP a lot last week, without success.

@poofyteddy poofyteddy reopened this Sep 16, 2020
@adlerweb

Guess I'll join the club. Also UniFi, also ESP8266, also WiFi problems. I saw very high latency (~200-300 ms) and drops every few seconds. I had already modified a lot of stuff before noticing the WiFi problem, so I have no idea whether my solution applies to vanilla WLED, but for me, not forcing 802.11n solved most UniFi problems (2 ms latency, no drops). You could try removing WiFi.setPhyMode(WIFI_PHY_MODE_11N);
https://github.com/Aircoookie/WLED/blob/f697f3d7ada2dfdd3cead902afbff697eaf8da20/wled00/wled.cpp#L303
and check if the connection is now stable.

@poofyteddy
Author

I will try it on the other controller when I have a couple of hours, @adlerweb.
But given that it drops when it is unable to communicate with other devices, it feels like it's trying to talk to something and gives up by dropping.

@Aircoookie
Member

@adlerweb thank you for the feedback! Forcing mode N was added to resolve connectivity issues with Asus routers IIRC. I believe it would probably be easiest if I add a checkbox so it can be disabled at runtime.

@stale

stale bot commented Jan 30, 2021

Hey! This issue has been open for quite some time without any new comments now. It will be closed automatically in a week if no further activity occurs.
Thank you for using WLED!

@stale stale bot added the stale This issue will be closed soon because of prolonged inactivity label Jan 30, 2021
@stale stale bot closed this as completed Feb 6, 2021
8 participants