AP disconnecting for no apparent reason
The network has been working quite well thanks to several of you out there. For that I am extremely grateful.
Mesh network consists of two CPEs and 6 APs. For some reason, the root has been disconnecting. Yesterday, it did it 8 times.
At first I thought it was a power failure, but I can still see the AP; it just reports that it's disconnected. Then it suddenly reconnects.
I also thought that if we lost the root AP, the network would go down. It doesn't. Guess that means the mesh is working as it was designed to, but this unexplained disconnecting is driving me nuts!
What can I do to figure this out?
@Byteguy I should also mention that I was surprised that the network was still working since it is the root AP (wired to the client CPE). All communication goes through this AP.
When it is reported as disconnected, I apparently can still see it through Omada Cloud as well as monitor all the other APs. I initially thought it was a power issue, but if that were so, both the AP and the CPE would have been offline.
I'm not sure just what is happening. I have rebooted the device and the latest firmware has been installed. It has been working well for about 3 months. This problem first appeared a couple of days ago but doesn't appear to occur with any regularity. One day it went down 6 or 7 times and a couple of days later it happened again a number of times.
Latest thing I have done is reassign the IP to a different value in case there is some unknown conflict. Other APs have not had this problem.
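One way to test the "unknown conflict" idea is to look for an IP address that shows up with more than one MAC address. Below is a minimal Python sketch, assuming you have already collected (IP, MAC) pairs from somewhere such as a router's ARP or client table. The function name `find_ip_conflicts` and all addresses are made up for illustration; nothing here comes from Omada itself.

```python
from collections import defaultdict


def find_ip_conflicts(observations):
    """Return {ip: sorted MACs} for every IP claimed by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for ip, mac in observations:
        # Normalize case so the same MAC is not counted twice.
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Hypothetical observations (illustrative addresses only).
observations = [
    ("192.168.0.10", "AA:BB:CC:00:00:01"),
    ("192.168.0.11", "AA:BB:CC:00:00:02"),
    ("192.168.0.10", "DD:EE:FF:00:00:09"),  # second device on the same IP
]

conflicts = find_ip_conflicts(observations)
for ip, macs in conflicts.items():
    print(f"{ip} is claimed by: {', '.join(macs)}")
```

Any IP that comes back with two or more MACs is worth checking before blaming cables or power.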
Hi @Byteguy,
How do you have 6 AP's? I thought that root nodes only supported 4 MESH connected downlinks (for a total of 5 AP's). Or maybe the 6th AP is MESH'ed to one of these 4?
Anyway, are you seeing the root node disconnected error in Omada / OC-200? This can occur for a number of potential reasons:
1) loss of power / power cycle / reboot (planned or unplanned) of the AP -- this will definitely cause wireless clients to lose connectivity to the Internet as the root node is their only path to the CPE uplink.
2) poor network connectivity between the root node and CPE
bad cables, misaligned CPE's, wifi interference (on the CPE channel), poor connectivity (cable, switch, etc) between the CPE and the OC-200/Omada server.
Depending on the failure mode, client access to the Internet might still be working, most of the time, though perhaps not as well as it could.
Could also be corrosion, cracked / pinched cable, etc.
3) Is it getting warm where you are? I'm also wondering if the root node is overheating and shutting down / restarting. Though you should have good airflow.
4) less likely, but power supply issues to the OC-200. Are you using USB or PoE for this? Where in the network topology is the OC-200 located? Try to put it "closer" to the root node.
You could even try putting it with the root node (you'll need a small ethernet switch and weather tight compartment). Connect the Ethernet from the CPE to the switch and then connect the switch to the OC-200 and Root Node. If you use a PoE switch it could power both the OC-200 and the AP.
Of course the problem with this approach is that if your CPE link goes down, you'll lose access to the OC-200 as well. BUT, the OC-200 is necessary for fast roaming and dynamic MESH re-configuration so your AP's need reliable access to it. Of course if the CPE uplink is down, this won't really matter since there's no internet.
Some ideas for the future:
A) buy a spare one or two EAP225-Outdoors to swap in if you suspect failure / flakiness. I don't think we really know how durable these are yet.
B) for the future, consider a more robust / fault tolerant CPE/root node solution. With the OC-200 proximal to the Wifi network. Ideally you would have, for example, two root nodes, each with their own CPE backhauls, with load balancing and failover.
This will also improve the overall throughput / performance of the Wifi network up to the limit of your Comcast connection.
-Jonathan
@JSchnee21 I'll answer the questions by inserting comments in your answer:
How do you have 6 AP's? I thought that root nodes only supported 4 MESH connected downlinks (for a total of 5 AP's). Or maybe the 6th AP is MESH'ed to one of these 4?
**You are correct. The root AP has 4 connections. The latest one we installed is linked to another AP.
Anyway, are you seeing the root node disconnected error in Omada / OC-200? This can occur for a number of potential reasons:
1) loss of power / power cycle / reboot (planned or unplanned) of the AP -- this will definitely cause wireless clients to lose connectivity to the Internet as the root node is their only path to the CPE uplink.
**Both the AP and CPE are connected by CAT-5/PoE. Mounted on the same pole.
2) poor network connectivity between the root node and CPE
bad cables, misaligned CPE's, wifi interference (on the CPE channel), poor connectivity (cable, switch, etc) between the CPE and the OC-200/Omada server.
Depending on the failure mode, client access to the Internet might still be working, most of the time, though perhaps not as well as it could.
**When we get the disconnected alert, all clients are routed through other APs. The clients connected to this node show as "0"
Could also be corrosion, cracked / pinched cable, etc.
3) Is it getting warm where you are? I'm also wondering if the root node is overheating and shutting down / restarting. Though you should have good airflow.
**Just the opposite. Weather is crappy. Some nice days, but we're still wondering when summer will arrive.
4) less likely, but power supply issues to the OC-200. Are you using USB or PoE for this? Where in the network topology is the OC-200 located? Try to put it "closer" to the root node.
**setup is router>4-port PoE switch>OC-200>Cat5 cable>CPE~~~wireless~~~CPE>AP (root)~~~wireless to all other APs. As I recall, the TP-Link people told me to do it this way.
You could even try putting it with the root node (you'll need a small ethernet switch and weather tight compartment). Connect the Ethernet from the CPE to the switch and then connect the switch to the OC-200 and Root Node. If you use a PoE switch it could power both the OC-200 and the AP.
Of course the problem with this approach is that if your CPE link goes down, you'll lose access to the OC-200 as well. BUT, the OC-200 is necessary for fast roaming and dynamic MESH re-configuration so your AP's need reliable access to it. Of course if the CPE uplink is down, this won't really matter since there's no internet.
Some ideas for the future:
A) buy a spare one or two EAP225-Outdoors to swap in if you suspect failure / flakiness. I don't think we really know how durable these are yet.
**My thoughts exactly. I'll tell the marina manager to do this.
B) for the future, consider a more robust / fault tolerant CPE/root node solution. With the OC-200 proximal to the Wifi network. Ideally you would have, for example, two root nodes, each with their own CPE backhauls, with load balancing and failover.
**It generally works just fine. Boat owners are complimenting the marina people on the great improvement over what they once had! The last two things I did were ones I believe you suggested: set the 5GHz band to channel 36, and assigned all APs their own IP after tweaking DHCP on the Comcast router to avoid conflicts.
This morning, I changed the IP on the problem AP.
Last disconnect experienced was at 7:41 this morning. Reconnected at 8:00. Changed IP at 8:03. No problems since.
Fingers remain crossed at this point.
I am wondering about the second set of CPEs. If I had another set, would there need to be a 2nd OC-200?
This will also improve the overall throughput / performance of the Wifi network up to the limit of your Comcast connection.
-Jonathan
@Byteguy Just had a thought. Although the AP says it's disconnected, since it is the root AP, it has to be working, otherwise no one could get on the internet. What I think is going on is the 2.4GHz band is going down, but the 5GHz is still working. All APs communicate on the 5GHz band, right? It's just that the AP is refusing client connections. Probably can't allow a 5GHz connection either.
So, what could be wrong to cause this? Like I said, although the AP says it's disconnected, I actually access all APs through it. All clients are moved to another AP. This rules out a bad cable or faulty power.
Still waiting to see if my changing the AP IP address makes any difference.
@Byteguy It's now 6:30 am and no problems since yesterday morning. Keeping fingers crossed. I believe I will ask the TP-Link tech people if they have any idea why the AP would suddenly refuse client connections while continuing to work with the other APs.
Hi @Byteguy,
Two questions:
1) you mention Cat5 a lot. Did you really use Cat5 cable, or are you using this term generically? Generally Cat5 is not used any more since it does not support Gigabit. Generally 5E or better (6/6A) is used for new installs. I always use at least 6 myself (or 6a).
For outdoor runs, I would use 6A Shielded Twisted Pair with UV resistant jackets and grounded / metallic terminations -- but this is one of the most expensive options -- save for Plenum rated cable. You may also need boots, silicone, etc. to protect the terminations from water -- depending on how things are installed.
2) I'm still confused about the "disconnect" message you are seeing. It's still not clear to me.
A) are you saying that the AP gets disconnected / isolated from Omada?
OR
B) as I think you're saying in your later post, that wireless clients are just getting bounced off of the root node 2.4GHz radio and pushed to other AP's?
If it's B, then there are a few possibilities:
- 2.4 GHz radio is set to auto channel and the channel changed, or there was significant, periodic interference on that channel.
- intentional changes to the AP settings (power, BW e.g. 20 vs 40, etc.) will also kick all of the users off of the radio onto other AP's.
- flaky hardware for that radio
But you are right, if the root node AP or its 5.8GHz radio was really going down, the entire wireless network would lose internet connectivity.
Similarly, all of the AP's need to have reliable access to the OC-200 to perform Fast Roaming and dynamic MESH reconfiguration. So if possible, I would move the OC-200 onto the PoE that feeds the root node and distal CPE.
-Jonathan
@JSchnee21 Sorry 'bout that. It's CAT5e.
And, we've just located the problem. When I changed the IP of the root AP, all was well. Didn't know why.
Turns out that there was another tech quite a while ago who was configuring surveillance cameras. He had assigned the IP that was in question to a camera.
A few months ago, Comcast came in and swapped out their router, so all the settings were lost.
The new router, of course, had generic settings. The whole range of IPs was available for DHCP. I went in and took the first 29 addresses away from DHCP. But the cameras were still using one of the IPs. I was not even aware that the cameras existed. (There are 2 separate surveillance systems. I knew about one, but not the other).
When I set up the network to assign static IPs to the APs, I inadvertently assigned the same address as the one camera.
Today the guy who originally set up the cameras is at the marina trying to figure out why the cameras aren't working as they should. That's when I found out how he set it up. So I worked with him to reassign addresses to the APs. Now the network is fine, but he'll have to deal with the cameras!
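For anyone who ends up in the same spot, here is a minimal Python sketch of the kind of check that would have caught this: list every statically addressed device (APs and cameras alike), then flag any address used twice or sitting inside the DHCP pool. The device names, addresses, and pool boundaries below are all hypothetical; the only detail taken from this thread is that the first 29 addresses were held back from DHCP.

```python
import ipaddress


def audit_static_ips(static_ips, dhcp_start, dhcp_end):
    """Return a list of problems found in a static IP plan.

    static_ips is a list of (device_name, ip_text) pairs.
    dhcp_start / dhcp_end bound the router's DHCP pool (inclusive).
    """
    start = ipaddress.ip_address(dhcp_start)
    end = ipaddress.ip_address(dhcp_end)
    owners = {}
    problems = []
    for device, ip_text in static_ips:
        ip = ipaddress.ip_address(ip_text)
        # A static address inside the pool can be handed out again by DHCP.
        if start <= ip <= end:
            problems.append(f"{device}: {ip} is inside the DHCP pool")
        # Two devices with the same static address will fight over it.
        if ip in owners:
            problems.append(f"{device}: {ip} already assigned to {owners[ip]}")
        else:
            owners[ip] = device
    return problems


# Hypothetical plan: pool assumed to start at .30 because .1-.29 are reserved.
plan = [
    ("root-ap", "192.168.0.5"),
    ("ap-2", "192.168.0.6"),
    ("camera-1", "192.168.0.5"),  # same address as root-ap
    ("ap-3", "192.168.0.40"),     # falls inside the DHCP pool
]

problems = audit_static_ips(plan, "192.168.0.30", "192.168.0.254")
for line in problems:
    print(line)
```

The check is only as good as the list you feed it, so devices nobody told you about (like a second camera system) still have to be found first.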
At least I now know I'm not crazy! What a relief.
Thanks
Art