Description
Hi,
We have run into wifi issues that we were able to mitigate either by adding a delay(1) directly in the main loop (i.e. effectively allowing background tasks such as wifi handling to run) or by adding serial output messages at seemingly random places.
We found that one simple for loop (iterating from 0 to 10) seems to contribute to the issue. However, a loop from 0 to 10 shouldn't block the ESP8266 for very long, especially since we only evaluate a few very simple conditions there (mostly comparing bool variables). Adding a serial output in a place that isn't even executed 99.9% of the time also mitigates the wifi issue.
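To back up the claim that this loop cannot block for long, the time spent in it could be measured. The following is only a sketch of how we might do that: `SectionTimer` and `record()` are hypothetical names that are not part of our code base, and the timestamps are assumed to come from Arduino's `micros()`.

```cpp
#include <cstdint>

// Hypothetical helper (not part of our code): remembers the longest
// duration seen between two microsecond timestamps.
struct SectionTimer {
    uint32_t maxMicros = 0;

    // Record one measurement. Unsigned subtraction gives the correct
    // elapsed time even if the 32-bit counter wrapped in between.
    void record(uint32_t startUs, uint32_t endUs) {
        uint32_t elapsed = endUs - startUs;
        if (elapsed > maxMicros) {
            maxMicros = elapsed;
        }
    }
};

// Assumed usage on the ESP8266 (inside the method that runs the loop):
//
//   static SectionTimer timer;
//   uint32_t t0 = micros();
//   /* ... the for loop over the TX queue ... */
//   timer.record(t0, micros());
//   // print timer.maxMicros occasionally, outside the measured section
```

If the maximum stays in the range of a few microseconds, the loop itself is unlikely to be what starves wifi. Note that printing the result also changes the code, so the measurement may hide the problem in the same way the LOG call does.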
So we concluded that some compiler optimization must be interfering, and we tried the -O2 compiler flag instead of the default -Os. With that, wifi works flawlessly, without adding any delays or serial outputs to our code.
Since our code is huge and complex, I am not able to post a minimal sketch to reproduce the problem. Any small change to the code can completely change the behaviour. But as an example, I can show which seemingly illogical changes make wifi processing work again:
for (uint8_t n = 0; n < RFM69_TX_QUEUE_LENGTH; n++)
{
    // delay(1);
    // Only work on packets where NewPacket = true and if they have retries left.
    // Do not work on Packets where ACKReceived = true, since those have already
    // been sent AND ACKed by the peer. They are only waiting to be cleared by user code.
    if (this->TXQueue[n].NewPacket == false || this->TXQueue[n].TXRetries == 0 || this->TXQueue[n].ACKReceived == true)
        continue;
    // LOG("%u", n);
    ADDITIONAL STUFF HAPPENS HERE....
}
This for loop processes waiting packets in a TX queue. Most of the time there are no packets in the queue, so each iteration hits the continue right after the first if statement. With the code above and the default compiler flag -Os, wifi does not work (it does not connect).
If I uncomment the LOG output AFTER the if statement (which is skipped by the continue most of the time, so the LOG doesn't even get executed), wifi suddenly connects and works again. If I leave the code as it is and use -O2 instead of -Os, wifi also starts working immediately.
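For readers without access to our code, the loop and the yielding workaround can be modelled roughly as follows. This is a reduced sketch, not our actual implementation: the struct names `TXPacket` and `TXQueueModel`, the field types, the queue length of 10 and the helper `cooperativeYield()` are assumptions made for illustration. On the ESP8266 Arduino core, `yield()` (like `delay()`) hands control back to the SDK so that wifi tasks can run; on a host build it is replaced by a counter.

```cpp
#include <cstdint>

#ifdef ARDUINO
#include <Arduino.h>  // provides yield()
static inline void cooperativeYield() { yield(); }
#else
// Host build: count the calls instead, so the model can be checked off-target.
static unsigned long yieldCalls = 0;
static inline void cooperativeYield() { ++yieldCalls; }
#endif

// Assumed value, based on the loop running from 0 to 10.
constexpr uint8_t RFM69_TX_QUEUE_LENGTH = 10;

// Hypothetical layout; only the field names come from our code.
struct TXPacket {
    bool NewPacket = false;
    uint8_t TXRetries = 0;
    bool ACKReceived = false;
};

struct TXQueueModel {
    TXPacket TXQueue[RFM69_TX_QUEUE_LENGTH];

    // Scans the queue and returns how many packets would be worked on.
    uint8_t process() {
        uint8_t worked = 0;
        for (uint8_t n = 0; n < RFM69_TX_QUEUE_LENGTH; n++) {
            // Skip empty slots, packets without retries left, and packets
            // that were already ACKed by the peer.
            if (!TXQueue[n].NewPacket || TXQueue[n].TXRetries == 0 || TXQueue[n].ACKReceived)
                continue;
            ++worked;  // stands in for the real transmit logic
        }
        // Explicitly give background tasks a chance to run once per scan.
        cooperativeYield();
        return worked;
    }
};
```

With an empty queue, `process()` returns 0 and does nothing except the single yield, which is why we do not expect this loop alone to be able to block wifi.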
LOG is a macro which calls Serial.printf(), nothing special. The fact that changing code in a completely unrelated place has this effect led us to the conclusion that compiler optimizations must be interfering, and to me it looks like this is indeed the case.
Without deeper analysis of our code, can it be said in general that -Os can be problematic? Is our code probably the issue? What should we be looking for? Is there a way to find out what exactly is blocking wifi from connecting correctly? Should we just use -O2 and be happy?
Can someone please advise whether this is likely an issue with the compiler optimizations (and not our fault) or whether we should dig deeper into our code, and give us some pointers?
Thanks!
PS: I opened a GitHub issue instead of a forum post because this may really be related to the chosen compiler optimization flags.