Connection drops for batch of clients (ESP8266) at the same time #151

Open
mkin1337 opened this issue Jul 5, 2019 · 6 comments

@mkin1337

mkin1337 commented Jul 5, 2019

I am struggling with a project in which things (ESP8266 clients) communicate with VerneMQ running on a DigitalOcean droplet. We have about 200 things at different locations, each sending about 20 messages per minute. Yesterday 8 things (out of 40 at that location) suddenly could no longer connect to the server or publish, although they were online. Today 4 other clients (out of 20) at another location were refused by the MQTT server (I guess) or could not connect, all within the same second.

All clients were able to connect to MQTT again by simply restarting them.

First we thought it was a network issue, but the network seems to be fine. Can you help us troubleshoot the issue? We run a DigitalOcean droplet for our MQTT server. We do not think the ESP8266 is the cause, as there is no reason for them all to drop the connection at the very same time. But we want to be sure.

Thank you very much in advance. Please let me know if you need some more specs.

Environment
VerneMQ Version: 1.7.1
OS: Ubuntu 16.04.6 x64
Server config: 4 GB Memory / 80 GB Disk
Cluster size/standalone: 1

Expected behavior
Clients connect to the MQTT server and publish data to various topics. Each client has a unique ID (MAC address).

Actual behaviour
Clients connect to the MQTT server and publish data to various topics. Occasionally a batch of clients drops the connection at the same time and cannot reconnect unless the devices are rebooted.

@arihantdaga
Contributor

@kinitzki I think this is related in some way to the ESP8266 only. It used to happen in my case too: a few devices would get disconnected from the MQTT broker simultaneously. My devices seemed to be stuck somewhere, so that even the main loop() function wouldn't execute, and hence no manual button, sensor, or LED would work. This was fixed after I updated my ESP8266 Arduino core. However, some problems still exist, related to WiFi sleep and the half-open connections that are possible with this library. There are other open issues on this repo and the esp8266 repo for the same.
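
To illustrate the half-open case: one common way to notice a dead connection from the client side is to track when the last packet arrived and treat the link as stale after a grace period. The following is only a rough sketch, not code from this library. `KeepAliveWatchdog` and its methods are hypothetical names, and the 1.5x keep-alive grace period is an assumption borrowed from the window the MQTT spec gives the broker.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of AsyncMqttClient or any other library.
// It flags a connection as stale when nothing has been received for
// 1.5x the keep-alive interval (an assumed grace period).
class KeepAliveWatchdog {
 public:
  explicit KeepAliveWatchdog(uint32_t keepAliveMs)
      : timeoutMs_(keepAliveMs + keepAliveMs / 2) {
    assert(keepAliveMs > 0);
  }

  // Call whenever any packet (e.g. PINGRESP) arrives; nowMs would be millis().
  void onPacketReceived(uint32_t nowMs) { lastRxMs_ = nowMs; }

  // Unsigned subtraction keeps this correct across millis() wrap-around.
  bool isStale(uint32_t nowMs) const { return nowMs - lastRxMs_ > timeoutMs_; }

 private:
  uint32_t timeoutMs_;
  uint32_t lastRxMs_ = 0;
};
```

On a device, `isStale(millis())` could be polled from `loop()`, and a stale result could trigger a forced close followed by a reconnect.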

@mkin1337
Author

@arihantdaga would you be so kind as to tell me how you solved that and what your current "status quo" is?

@arihantdaga
Contributor

@kinitzki, there are a few things.
First problem: the device gets stuck and even the main loop functions do not run. This was solved by updating the ESP8266 Arduino core.
Second problem: I have noticed that many times (often when my router was working but my ISP had an issue and there was no internet for a brief period), the device would not connect even after internet was available again. I dug into it and found there could be a couple of reasons. Long story short, to fix the problem I implemented two things, which ensure that the device will connect again for sure.

  1. Fixes #105 . Mqtt half open connections.  #156
  2. I changed the connect() function like this (I'll create a pull request later)
bool AsyncMqttClient::_connect() {
#if ASYNC_TCP_SSL_ENABLED
  if (_useIp) {
    return _client.connect(_ip, _port, _secure);
  }
  return _client.connect(_host, _port, _secure);
#else
  if (_useIp) {
    return _client.connect(_ip, _port);
  }
  return _client.connect(_host, _port);
#endif
}

bool AsyncMqttClient::connect() {
  if (_connected) return true;
  if (_connect()) return true;
  // If we could not connect, there could be two reasons:
  //   1. _pcb already exists
  //   2. a new pcb could not be allocated (not sure why this could happen)
  // So we have to free _pcb and retry.
  _client.close(true);  // Close now.
  return _connect();
}

You can also give this a try. I think this would solve the problem.
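
For anyone who wants to check the retry logic off-device, here is a minimal host-side sketch of the same "try, force-close, retry once" pattern. `FakeTcpClient` and `connectWithRetry` are hypothetical stand-ins made up for this illustration. The fake simply assumes that connect() fails while a stale pcb is held and that close(true) releases it, which mirrors the first reason in the comments above but is not verified against the real TCP stack.

```cpp
#include <cassert>

// Hypothetical stand-in for the TCP client (not the real AsyncClient).
// Assumption: connect() fails while a stale pcb is still held, and
// close(true) releases it immediately.
struct FakeTcpClient {
  bool stalePcb = true;  // simulate a leftover half-open connection
  int connectCalls = 0;

  bool connect() {
    ++connectCalls;
    return !stalePcb;
  }
  void close(bool /*now*/) { stalePcb = false; }
};

// Same shape as the patched connect(): try once, and on failure
// force-close the client and retry exactly once.
bool connectWithRetry(FakeTcpClient& client) {
  if (client.connect()) return true;
  client.close(true);
  return client.connect();
}
```

With a stale pcb the fake needs two connect() calls to succeed, and with a clean client only one, so the retry costs nothing in the normal case.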

@jeroenst
Contributor

jeroenst commented Nov 29, 2019

Is this maybe related to esp8266/Arduino#5083?

I don't experience this problem when using WiFi.setSleepMode(WIFI_NONE_SLEEP);

@YuriRB

YuriRB commented Feb 6, 2020

Try to use WiFi.setSleepMode(WIFI_NONE_SLEEP), and remove WiFi.mode().
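
Roughly, the WiFi setup could then look like the sketch below. This is only an illustration: `setupWifi`, `WIFI_SSID` and `WIFI_PASS` are placeholder names, and the `FakeWiFi` stub exists only so the snippet also compiles on a PC. On the device the real `ESP8266WiFi.h` is used instead.

```cpp
#include <cassert>

#ifdef ARDUINO
#include <ESP8266WiFi.h>
#else
// Host-side stand-in so the snippet compiles off-device.
// Hypothetical stub, not the real ESP8266 core.
enum WiFiSleepType_t { WIFI_NONE_SLEEP = 0, WIFI_LIGHT_SLEEP = 1, WIFI_MODEM_SLEEP = 2 };
struct FakeWiFi {
  WiFiSleepType_t sleepMode = WIFI_MODEM_SLEEP;  // arbitrary starting value for the stub
  bool began = false;
  bool setSleepMode(WiFiSleepType_t mode) { sleepMode = mode; return true; }
  void begin(const char* /*ssid*/, const char* /*pass*/) { began = true; }
} WiFi;
#endif

const char* WIFI_SSID = "my-ssid";      // placeholder credentials
const char* WIFI_PASS = "my-password";  // placeholder credentials

// Disable WiFi sleep before connecting; note there is no WiFi.mode() call.
void setupWifi() {
  WiFi.setSleepMode(WIFI_NONE_SLEEP);
  WiFi.begin(WIFI_SSID, WIFI_PASS);
}
```

On the ESP8266 you would call `setupWifi()` from `setup()` before connecting the MQTT client.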

@bertmelis
Contributor

Is this issue still relevant?
