Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`

This thread has been locked for further replies. You can start a new thread to share your ideas or ask questions.
2021-06-27 20:36:19 - last edited 2021-07-06 06:39:24
Model: ER605 (TL-R605)  
Hardware Version: V1
Firmware Version: 1.1.0

Several times now, the controller (software-based, running in Docker) has lost all the devices and re-adopted them. Because of this, all the clients lose their connection and internet access...

Omada mobile app sends such notifications during this:

 

 

Controller logs have only WAN-down:

 

Docker container has more helpful logs of the controller app itself:

2021-06-21 00:27:58 [Thread-28] [INFO]-[SourceFile:146] - Close connection to service server.
2021-06-21 00:27:58 [Thread-39] [INFO]-[SourceFile:921] - Thread 'heartBeatThread' is stopped
2021-06-21 00:27:58 [Thread-38] [INFO]-[SourceFile:792] - expiredRequestCleanThread is interrupted.
2021-06-21 00:27:58 [Thread-38] [INFO]-[SourceFile:796] - Thread 'expiredRequestCleanThread' is stopped
2021-06-21 00:27:58 [Thread-37] [INFO]-[SourceFile:837] - javax.net.ssl.SSLException: Socket closed
javax.net.ssl.SSLException: Socket closed
	at sun.security.ssl.Alert.createSSLException(Alert.java:127) ~[?:1.8.0_292]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:324) ~[?:1.8.0_292]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:267) ~[?:1.8.0_292]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:262) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1554) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.access$400(SSLSocketImpl.java:73) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:964) ~[?:1.8.0_292]
	at com.tplink.eap.cloudsdk.client.p.a(SourceFile:120) ~[cloudsdk-1.0.5.jar:?]
	at com.tplink.eap.cloudsdk.client.s.run(SourceFile:830) [cloudsdk-1.0.5.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Caused by: java.net.SocketException: Socket closed
	at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_292]
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_292]
	at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_292]
	at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1332) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:948) ~[?:1.8.0_292]
	... 3 more
2021-06-21 00:27:58 [Thread-37] [INFO]-[SourceFile:839] - Thread 'recvThread' is stopped
2021-06-21 00:27:58 [Thread-28] [INFO]-[SourceFile:127] - Connect service server automatically, ConnectionType is PERSISTENT_CONNECTION.
2021-06-21 00:27:58 [Thread-28] [INFO]-[SourceFile:366] - The result of connection is false.
2021-06-21 00:27:58 [Thread-28] [INFO]-[SourceFile:146] - Close connection to service server.
2021-06-21 00:27:58 [Thread-28] [INFO]-[SourceFile:110] - Connect SEF server automatically.
2021-06-21 00:29:03 [Thread-28] [INFO]-[SourceFile:194] - The result of connection is true, service host is n-devs-smb.tplinkcloud.com, port is 443.
2021-06-21 00:29:03 [Thread-28] [INFO]-[SourceFile:114] - Close connection to SEF server.
2021-06-21 00:29:03 [Thread-1702] [INFO]-[SourceFile:403] - javax.net.ssl.SSLException: Socket closed
javax.net.ssl.SSLException: Socket closed
	at sun.security.ssl.Alert.createSSLException(Alert.java:127) ~[?:1.8.0_292]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:324) ~[?:1.8.0_292]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:267) ~[?:1.8.0_292]
	at sun.security.ssl.TransportContext.fatal(TransportContext.java:262) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1554) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.access$400(SSLSocketImpl.java:73) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:964) ~[?:1.8.0_292]
	at com.tplink.eap.cloudsdk.client.p.a(SourceFile:120) ~[cloudsdk-1.0.5.jar:?]
	at com.tplink.eap.cloudsdk.client.o.run(SourceFile:397) [cloudsdk-1.0.5.jar:?]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
Caused by: java.net.SocketException: Socket closed
	at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_292]
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_292]
	at java.net.SocketInputStream.read(SocketInputStream.java:171) ~[?:1.8.0_292]
	at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:457) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:68) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1332) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl.access$300(SSLSocketImpl.java:73) ~[?:1.8.0_292]
	at sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:948) ~[?:1.8.0_292]
	... 3 more
2021-06-21 00:29:03 [Thread-1702] [INFO]-[SourceFile:407] - ssl recv thread is stopped.
2021-06-21 00:29:03 [Thread-28] [INFO]-[SourceFile:127] - Connect service server automatically, ConnectionType is PERSISTENT_CONNECTION.
2021-06-21 00:29:23 [Thread-28] [INFO]-[SourceFile:320] - Exception occurs when connecting service server:java.net.UnknownHostException: n-devs-smb.tplinkcloud.com: Temporary failure in name resolution
2021-06-21 00:29:23 [pool-32-thread-1] [WARN]-[SourceFile:577] - check device bind status error: -7119, Operation failed. Network connection error. 
2021-06-21 00:30:48 [Thread-28] [INFO]-[SourceFile:366] - The result of connection is true.
2021-06-21 04:47:48 [scheduled-pool-8] [INFO]-[SourceFile:61] - Roaming status unstable, roaming interval changed to 20s
2021-06-21 04:57:48 [scheduled-pool-4] [INFO]-[SourceFile:54] - Roaming status stable, roaming interval changed to 8h
2021-06-21 08:36:11 [topology-task-0] [INFO]-[SourceFile:221] - topology of switch 00-5F-67-00-00-00 changed
2021-06-21 08:39:41 [topology-task-0] [INFO]-[SourceFile:221] - topology of switch 00-5F-67-00-00-00 changed

 

For the record, my switch hasn't been moved, re-plugged, etc. (this refers to the last log lines about the switch topology changing).
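For anyone who wants to measure how long these drops last, here is a minimal Python sketch that scans a controller log in the layout shown above. It pairs the first `Close connection to service server.` message with the next `The result of connection is true.` message. Treating that pair as one outage is an assumption based only on this excerpt, and the function names (`parse_events`, `cloud_outages`) are made up for illustration. It measures the cloud-service connection only, not device adoption.

```python
import re
from datetime import datetime

# Layout seen in the excerpt above:
# "YYYY-MM-DD HH:MM:SS [thread] [LEVEL]-[SourceFile:N] - message"
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[([^\]]+)\] \[(\w+)\]-\[[^\]]*\] - (.*)$"
)

# Message texts copied from the excerpt; pairing them is an assumption.
CLOSE_MSG = "Close connection to service server."
OK_MSG = "The result of connection is true."


def parse_events(lines):
    """Yield (timestamp, thread, level, message) for each log line.

    Stack-trace continuation lines do not match the layout and are skipped.
    """
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            yield ts, match.group(2), match.group(3), match.group(4).strip()


def cloud_outages(lines):
    """Return a list of (start, end, seconds) for each close/reconnect pair."""
    outages = []
    started = None
    for ts, _thread, _level, message in parse_events(lines):
        if message == CLOSE_MSG and started is None:
            # Remember only the first close; later closes belong to the same gap.
            started = ts
        elif message == OK_MSG and started is not None:
            outages.append((started, ts, (ts - started).total_seconds()))
            started = None
    return outages
```

To use it, pass any iterable of log lines, for example `cloud_outages(open("server.log", encoding="utf-8"))`, where `server.log` is a placeholder for wherever your container writes the controller log.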

#1
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-06-28 10:42:43

Dear @Shurov,

 

Shurov wrote

Several times a controller (software based, in docker) is losing all the devices and re-adopting them. Because of this, all the clients lose connection and internet access...

 

Are you using the Omada Controller v4.3.5 for Linux? Has the issue been occurring since the first day you installed the Controller?

 

Could you please export the Running Log under Settings -> Services -> Export Data from your Controller?

>> Omada EAP Firmware Trial Available Here << *Try filtering posts on each forum by Label of [Early Access]*
#2
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-06-28 11:08:15

@Fae, yep, I'm using the latest version, 4.3.5, on Linux. Until late May I had only APs, and I constantly saw reconnection-cycle notifications from the mobile app. In late May I went all-Omada and purchased a switch and a router. With them, the issue got worse and more noticeable, with longer internet/Wi-Fi outages, because re-adopting them on top of the APs takes much longer.

I'm sure those reconnects (of just the APs at that point) have been happening for a long time, even before 4.3.5 (whose release date I see as late April).

Please find the logs attached. I've removed all the client logs and left only the ones for the managed devices.

File: Omada_log.csv
#3
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-06-29 07:07:06

Dear @Shurov,

 

Shurov wrote

@Fae, yep, I'm using the latest version, 4.3.5, on Linux. Until late May I had only APs, and I constantly saw reconnection-cycle notifications from the mobile app. In late May I went all-Omada and purchased a switch and a router. With them, the issue got worse and more noticeable, with longer internet/Wi-Fi outages, because re-adopting them on top of the APs takes much longer.

I'm sure those reconnects (of just the APs at that point) have been happening for a long time, even before 4.3.5 (whose release date I see as late April).

Please find the logs attached. I've removed all the client logs and left only the ones for the managed devices.

 

Thank you for verifying the information. From the logs you attached, I did find that the devices were automatically re-adopted at random multiple times.

 

More detailed information such as the config file might be required to figure out the problem.

 

For your case, I'd like to escalate to the TP-Link support team who could help you more efficiently.

They will contact you at your registered email address shortly, so please keep an eye on your inbox.

Once the issue is addressed or resolved, I'd encourage you to share it with the community.

Thank you so much for your cooperation and support!

#4
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`-Solution
2021-07-06 06:39:11 - last edited 2021-08-13 12:18:30

Hi All,

 

Here is the follow-up on this case.

 

From the running log, R&D found that the readoption issue is caused by a reboot of the controller, and the problem has been located:

 

When the Controller starts, it provisions the DST settings to the Devices before they have been successfully adopted. This leaves the configuration versions inconsistent, so the Controller provisions the full configuration to the Devices once they are successfully adopted.

 

As you may know, the Devices are reconfigured during provisioning (you will see the WAN go down and come back up shortly afterwards), and everything returns to normal once the configuration is complete (this normally takes about 5 minutes).
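The mechanism described above can be illustrated with a toy Python model. This is only a sketch under stated assumptions, not the Controller's actual code: the class names, the integer `config_version` counter, and the "full" versus "incremental" labels are all hypothetical. It only shows why pushing a setting before adoption would leave the versions out of step and trigger a full re-provision.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    config_version: int = 0  # last configuration version the device applied
    adopted: bool = False


@dataclass
class Controller:
    config_version: int = 0
    actions: list = field(default_factory=list)

    def push_dst(self, device):
        """Push a small settings change; every push bumps the controller's version."""
        self.config_version += 1
        if device.adopted:
            device.config_version = self.config_version
            self.actions.append(("incremental", device.name))
        # A device that is not adopted yet never receives the change,
        # so its version silently falls behind the controller's.

    def adopt(self, device):
        """Adopt a device and re-provision everything if the versions differ."""
        device.adopted = True
        if device.config_version != self.config_version:
            device.config_version = self.config_version
            self.actions.append(("full", device.name))


def startup(push_before_adoption):
    """Simulate a controller restart with both sides initially at version 5."""
    controller = Controller(config_version=5)
    device = Device("ER605", config_version=5)
    if push_before_adoption:
        controller.push_dst(device)  # order described in the post above
        controller.adopt(device)
    else:
        controller.adopt(device)     # hypothetical corrected order
        controller.push_dst(device)
    return controller.actions
```

In this model, `startup(True)` ends with a full provision for the device, while `startup(False)` ends with only an incremental push. Whether the real fix reorders the steps this way is not stated in the thread.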


The official firmware release that fixes this issue may not be ready soon. If you are seriously affected by the issue or need an urgent fix, please don't hesitate to let me know so that I can apply for a beta firmware for you.

Note: to get a beta firmware, please let me know the Controller you are using (Software Windows/Linux or Hardware OC200/OC300).

 

Update on 13 August, 2021:

 

This issue has been resolved in Controller v4.4.4. You may visit the official website or check the following post for the update.

[New Firmware] Omada Software Controller v4.4.4 has been released on August 13, 2021

 

Thank you for your attention!

Recommended Solution
#5
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-07-06 07:46:58

@Fae Yes, please give us a beta; maybe this will fix all the problems with this router. I use an OC200.

#6
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-07-20 21:55:07

@Fae 

 

Please provide a beta version.
I am using the Windows Controller version.

#7
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-07-21 03:47:12

Dear @shberge, @canopus,

 

Sorry, the beta firmware won't be provided because the official firmware will be released soon.

 

I suggest waiting for the official release; please watch this post for an update.

 

Thank you for your great cooperation and patience!

#8
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-07-21 20:46:57

I'm running Omada Controller 4.4.3 on Linux and had this issue today.

 

All clients were disconnected at the time this issue happened.

 

This was also happening on 4.3.5. The AP is an EAP225 v3.0.

 

2021-07-21 17:34:36 [Thread-23] [INFO]-[SourceFile:833] - 'recvThread' exception:java.net.SocketException: There is no more data because the end of the stream has been reached.
2021-07-21 17:34:36 [Thread-23] [INFO]-[SourceFile:839] - Thread 'recvThread' is stopped
2021-07-21 17:34:41 [Thread-25] [INFO]-[SourceFile:913] - java.net.SocketException: Connection or outbound has closed
java.net.SocketException: Connection or outbound has closed
        at sun.security.ssl.SSLSocketImpl$AppOutputStream.write(SSLSocketImpl.java:1018) ~[?:1.8.0_261]
        at java.io.OutputStream.write(OutputStream.java:75) ~[?:1.8.0_261]
        at com.tplink.eap.cloudsdk.client.p.a(SourceFile:111) ~[cloudsdk-1.0.5.jar:?]
        at com.tplink.eap.cloudsdk.client.q.a(SourceFile:484) ~[cloudsdk-1.0.5.jar:?]
        at com.tplink.eap.cloudsdk.client.q.a(SourceFile:418) ~[cloudsdk-1.0.5.jar:?]
        at com.tplink.eap.cloudsdk.client.t.run(SourceFile:889) [cloudsdk-1.0.5.jar:?]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_261]
2021-07-21 17:34:41 [Thread-25] [INFO]-[SourceFile:921] - Thread 'heartBeatThread' is stopped
2021-07-21 17:34:41 [Thread-19] [INFO]-[SourceFile:146] - Close connection to service server.
2021-07-21 17:34:41 [Thread-24] [INFO]-[SourceFile:792] - expiredRequestCleanThread is interrupted.
2021-07-21 17:34:41 [Thread-24] [INFO]-[SourceFile:796] - Thread 'expiredRequestCleanThread' is stopped
2021-07-21 17:34:41 [Thread-19] [INFO]-[SourceFile:127] - Connect service server automatically, ConnectionType is PERSISTENT_CONNECTION.
2021-07-21 17:36:50 [Thread-19] [INFO]-[SourceFile:320] - Exception occurs when connecting service server:java.net.ConnectException: Connection timed out (Connection timed out)
2021-07-21 17:42:10 [Thread-19] [INFO]-[SourceFile:320] - Exception occurs when connecting service server:java.net.ConnectException: Connection timed out (Connection timed out)
 

 

#9
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-08-14 09:42:09

@Fae Unfortunately, the problem still occurs.

Controller Version: 4.4.4

EAP225 updated to 5.0.5

ER605 v1.0 updated to 1.1.1

 

When the software controller starts, it shuts down the WAN port, disconnects all devices, then re-adopts and reconnects them. This takes about 1-2 minutes.

 

 

 

 

#10
Re:Controller is randomly re-adopting everything due to `javax.net.ssl.SSLException: Socket closed`
2021-08-14 17:38:57

@Fae

 

I did an upgrade (including the AP firmware for the EAP225 and EAP245).

Rebooted everything.

Running on Linux.

Still the same behavior: when the server restarts, all 3 APs get re-adopted.

 

-H-

 

#11