Common Questions About the Load Balancing, Link Backup(Failover) & Online Detection

2023-11-16 05:48:24 - last edited 2024-08-28 08:36:16

Background: 

 

To address the growing number of questions about Load Balancing, this article discusses the most frequently asked ones and answers them with explanations.

 

This Article Applies to:

 

All routers with the Load Balance function.

 

Application Scenario:

 

 

Contents:

 

>Common Issues and Explanations

 

1. Load Balancing does not switch clients back to Primary WAN properly.

2. Backup WAN shows "Offline".

3. Backup WAN has traffic flow while it is inactive.

4. How does Online Detection work? Where do I set the Online Detection in Controller mode?

5. Why are there "unauthorized" DNS resolutions and pings? Where do they come from?

6. Can I add up or aggregate my Internet speed by Load Balancing?

7. How can I achieve a cascade Backup? Primary > Secondary > Pis Aller(Last resort)

8. Is the Load Balancing effective on WIFI?

9. Does the Link Backup or Load Balancing detect slow speed?

10. Why do the WAN statistics show differently from the preset ratio?

>Note
>QA
>Update Log

 

Common Issues and Explanations:

 

1. Load Balancing does not switch clients back to Primary WAN properly.

 

To address this issue, first make sure your router and Load Balancing are configured correctly. Please refer to the guide: Troubleshooting Online Detection and Link Backup (Failover) Don't Take Effect

With Load Balancing and Online Detection configured correctly, your network should switch back to the primary quickly. How long the switch to the backup or back to the primary takes depends on your Online Detection interval.

 

If your Load Balancing and Failover cannot switch back to or prioritize your Primary WAN, please follow the steps below and paste your screenshots into a New Thread so I can discuss and troubleshoot with you.

 

Steps for verification:

 

1. Set up the Link Backup properly. (This applies to the Controller as well. The mechanism is the same.)

 

2. Make sure both of your WANs work so that the Online Detection can work properly. Run tracert in Command Prompt (CMD) on a computer to learn your route. You can unplug the WANs one at a time and verify the Internet connection and routing tables.

 

 

3. Simulate the Primary WAN (WAN/LAN4) being down by setting an incorrect IP address on the Primary WAN.

(We do not recommend unplugging the Ethernet cable, because that does not test the Online Detection. We want to make sure the Online Detection can detect the failure and make your router switch to the Backup WAN.)

 

The yellow marker shows the incorrect WAN IP address, and the tracert shows that my connection has been switched to the backup.

 

 

4. Set the Primary WAN (WAN/LAN4) back to the correct Connection Type and monitor the connection with tracert again. Note that I ran two CMD windows to rule out errors.

 

 

This is the end of the verification. If you experience this issue, please follow the steps above, take screenshots of your configuration and each step, and start a New Thread. Our team will follow up.
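If you repeat the tracert check often, reading the hops by eye gets tedious. Below is a minimal Python sketch that reads saved tracert output and reports which WAN gateway appears among the hops. The gateway addresses, the WAN names, and the sample trace are all made-up placeholders; replace them with your own values.

```python
import re

# Matches a dotted IPv4 address anywhere in a line.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def hop_addresses(trace_output: str) -> list[str]:
    """Return the first IPv4 address on each numbered hop line ('*' if none)."""
    hops = []
    for line in trace_output.splitlines():
        # Hop lines start with the hop number, e.g. "  2     3 ms ..."
        if not re.match(r"\s*\d+\s", line):
            continue
        match = IPV4.search(line)
        hops.append(match.group(0) if match else "*")
    return hops


def wan_in_use(trace_output: str, wan_gateways: dict[str, str]) -> str:
    """Name the WAN whose gateway address shows up among the hops."""
    hops = hop_addresses(trace_output)
    for name, gateway in wan_gateways.items():
        if gateway in hops:
            return name
    return "unknown"


# Hypothetical trace and gateways, for illustration only.
SAMPLE = """\
Tracing route to 8.8.8.8 over a maximum of 30 hops

  1    <1 ms    <1 ms    <1 ms  192.168.0.1
  2     3 ms     2 ms     2 ms  10.0.2.1
  3     *        *        *     Request timed out.
"""
GATEWAYS = {"Primary WAN": "10.0.1.1", "Backup WAN": "10.0.2.1"}

print(wan_in_use(SAMPLE, GATEWAYS))
```

Save the output of each tracert run to a text file, pass its contents to wan_in_use, and compare the result before and after you break and restore the Primary WAN.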

 

2. Backup WAN shows "Offline".

 

If you have enabled Link Backup (AKA Failover), your backup WAN remains inactive, so its status shows as "Offline".

 

Please note that packets are still counted while the WAN is inactive. The statistics system counts packets rather than traffic volume. While inactive, the WAN still carries a small number of packets, such as DHCP discovery and request (for maintaining a healthy WAN connection), DNS resolution, and time syncing.

 

                       Internet Status   Connection Status
Ethernet Plugged In    Up                Online
Ethernet Plugged In    Down              Offline
Ethernet Plugged In    -                 Link Up
Ethernet Plugged Out   -                 Link Down

 

 

3. Backup WAN has traffic flow while it is inactive.

 

As mentioned previously:

Please note that packets are still counted while the WAN is inactive. The statistics system counts packets rather than traffic volume. While inactive, the WAN still carries a small number of packets, such as DHCP discovery and request (for maintaining a healthy WAN connection), DNS resolution, and time syncing.

 

While inactive, the backup WAN does not carry much traffic. The router should not use up your data plan or consume an unusual amount of data.

ICMP is used here to detect Internet availability.

DHCP discovery/request is plain.

 

Note that these Wireshark results may differ from one network to another. The pictures were annotated differently.

 

  • 23 hours ago, it showed this much data:

 

winservice.tp-link.com is a local service based on our local network. You should not see it in yours.

 

 

  • 23 hours later:

 

 

 

If your metered LTE plan was used up or an unusual amount of data was consumed, please follow the steps below and paste your screenshots into a New Thread so I can discuss and troubleshoot with you.

 

Steps for troubleshooting:

 

1. Check whether your Primary WAN went down earlier and the backup kept the Internet flowing without you noticing. Please include a screenshot of your Log showing the WAN Up or Down events in your new thread. Alternatively, our team will create a ticket for you, ask for an export of your log, and review it against the timeline of when you noticed the issue.

 

 

2. Log in to your LTE modem (router) and check whether its WIFI is enabled. If so, disable it. You can view the DHCP client list to check whether any clients are using the WIFI from your LTE modem.

3. Capture packets with Wireshark. Refer to: How to capture packets using Wireshark on SMB router or switch, and How to Use Port Mirror to Capture Packets in the Controller

  • Mirror the WAN port and monitor your traffic from the WAN.
  • Leave the capture running on your computer for hours or days to check whether there is any traffic from the backup WAN (while inactive) on the Omada router. Share the screenshots in your newly created thread.

 

4. How does Online Detection work? Where do I set the Online Detection in Controller mode?

 

Online Detection is the name of the Internet availability test in Standalone mode. In Controller mode, it is called Echo Server.

For any Internet detection IP or domain, please set a public one. We do NOT recommend using a private IP address, even though the field accepts one.

The router continuously pings from the WAN and uses the ICMP replies from the IP (server) or domain you've set to determine whether your Internet is available.
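If you are unsure whether the address you plan to use counts as public, Python's standard ipaddress module can tell you. This is only a convenience check on your own computer, not something the router runs; the function name is made up for this example, and it accepts IP literals only (a hostname raises ValueError).

```python
import ipaddress


def is_public_detection_target(address: str) -> bool:
    """True if the IP literal is globally routable (not private, CGNAT, loopback, etc.)."""
    return ipaddress.ip_address(address).is_global


for candidate in ("8.8.8.8", "192.168.0.1", "10.0.0.1"):
    verdict = "public" if is_public_detection_target(candidate) else "not public"
    print(candidate, "->", verdict)
```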

 

 

5. Why there are "unauthorized" DNS resolutions and ping? Where do they come from?

 

As mentioned before, if you have set the Online Detection (Echo Server in Controller mode), the router pings the server IP you've set. So you'll see ICMP packets to that IP address in Wireshark.

 

 

It is normal to see only replies: the router sent the requests, and I am capturing with Wireshark from the LAN. Since ICMP works by Request and Reply, the replies show that the Internet actually works.

 

 

If you did not set the Online Detection manually but have enabled Link Backup, Online Detection is enabled automatically.

In this situation, the router uses the default address to ping and test your Internet availability.

 

Additionally, the system resolves four built-in domains to detect whether your network is online: www.google.com, www.tp-link.com, www.ieee.org, www.w3.org.
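When you review a capture, it helps to separate these four names from everything else. A small Python sketch, assuming you have already exported the queried names from Wireshark into a list (the sample list below is invented):

```python
# The four detection domains listed in this article.
DETECTION_DOMAINS = {"www.google.com", "www.tp-link.com", "www.ieee.org", "www.w3.org"}


def split_queries(names):
    """Split queried names into (router detection lookups, everything else)."""
    detection, other = [], []
    for name in names:
        # Captures sometimes show a trailing dot or mixed case; normalize first.
        normalized = name.rstrip(".").lower()
        (detection if normalized in DETECTION_DOMAINS else other).append(normalized)
    return detection, other


# Hypothetical names exported from a capture.
captured = ["www.google.com", "WWW.W3.ORG.", "example.com", "www.ieee.org"]
detection, other = split_queries(captured)
print("Online Detection lookups:", detection)
print("Other lookups:", other)
```

Names in the first list are expected from Online Detection. Note that a client on your LAN can also look up www.google.com, so check the source of the query as well.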

 

6. Can I add up or aggregate my Internet speed by Load Balancing?

 

No, Load Balancing does NOT add up or aggregate your download or upload speed. The exception is when the server you are downloading from supports multiple sessions, you have enabled Load Balancing, and your multiple WANs are active (status Online). In this situation, the background Load Balancing algorithm balances the sessions between the WANs.

Again, the prerequisite is that the server you are downloading from supports multiple sessions. This is the only scenario in which you may see an aggregated speed.
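To see why a single session cannot exceed one WAN, here is a toy Python model. It hands out sessions round-robin, which is only a stand-in because the router's real balancing algorithm is not described in this article, and it assumes the remote server is never the bottleneck. The WAN speeds are example numbers.

```python
def assign_sessions(session_count, wan_speeds_mbps):
    """Distribute sessions round-robin; return how many sessions land on each WAN."""
    counts = [0] * len(wan_speeds_mbps)
    for i in range(session_count):
        counts[i % len(wan_speeds_mbps)] += 1
    return counts


def best_case_throughput(session_count, wan_speeds_mbps):
    """Upper bound on total speed: only WANs that carry at least one session contribute."""
    counts = assign_sessions(session_count, wan_speeds_mbps)
    return sum(speed for speed, n in zip(wan_speeds_mbps, counts) if n > 0)


wans = [100, 50]  # hypothetical WAN1 and WAN2 speeds in Mbps
print(best_case_throughput(1, wans))  # one session stays on one WAN
print(best_case_throughput(4, wans))  # several sessions can use both WANs
```

With one session the bound is the speed of a single WAN. Only with multiple sessions can the total approach the sum of both.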

If you simply want to test your multiple connections, refer to this FAQ: Why failing to achieve bandwidth aggregation of multiple WAN ports by Speedtest on SMB router

 

7. How can I achieve a cascade Backup? Primary > Secondary > Pis Aller(Last resort)

 

The system does not support this failover setup by default, but there is a workaround.

For example, suppose you have three WANs.

1. Set WAN1 and WAN2 as the primary WANs. Set WAN3 as the backup WAN.

 

 

2. Go to Policy Routing and route ALL networks to WAN1, with the mode set to Priority.
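The behavior this workaround aims for can be described as "use the first WAN in the priority order that is online". The Python sketch below only illustrates that intended outcome; it is not how the router implements it, and the function name is made up.

```python
def pick_active_wan(priority_order, online):
    """Return the first WAN in priority order that is online, or None if all are down."""
    for wan in priority_order:
        if online.get(wan, False):
            return wan
    return None


order = ["WAN1", "WAN2", "WAN3"]  # Primary > Secondary > Last resort

print(pick_active_wan(order, {"WAN1": True, "WAN2": True, "WAN3": True}))
print(pick_active_wan(order, {"WAN1": False, "WAN2": True, "WAN3": True}))
print(pick_active_wan(order, {"WAN1": False, "WAN2": False, "WAN3": True}))
```

After configuring the two steps above, you can take WAN1 and then WAN2 offline in turn and confirm with tracert that the active WAN follows this order.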

 

 

8. Is the Load Balancing effective on WIFI?

 

Yes, of course. There is no reason for it to exclude WIFI clients. Load Balancing is set on the router and, like most router features, it is layer 3 (IP) based. Your network routes according to the IP.

So, if you have trouble with this, you can verify it with the tracert steps above.

 

9. Does the Link Backup or Load Balancing detect slow speed?

No.

Load Balancing does not detect the slow speed of your WANs. It balances based on the sessions.

Link Backup relies on Online Detection to switch between lines, so it only switches when the primary fails, not when the primary is merely slow.

 

Additionally, this would be hard to achieve and, if it were possible, would cost a lot of performance.

1. Switching over on slow speed would require the system to measure the Internet speed of your WANs 24/7. That means running a speed test on all of your WANs at the same time, non-stop.

2. A speed test takes up all the available speed while it runs. A rate-limited speed test would defeat the purpose of the test. Of course, this will slow your Internet connection.

It is also impossible for us to host a server for limited speed tests, since the server cost would be enormous. The speed test services you see do not run on single servers, and they profit from advertisements. We cannot provide free, ad-free, and fast speed test servers for the millions of users we have.

3. The speed test takes up NAT throughput, and the performance of a device is limited. With part of the performance reserved for a speed test that forwards packets 24/7, the overall or advertised performance cannot be reached.
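To put a rough number on the cost, the Python sketch below estimates how much data periodic speed tests would consume. All figures are hypothetical, and it assumes a test that saturates the link for a fixed duration at a fixed interval, which is already gentler than the non-stop testing described above.

```python
def speed_test_data_mb(link_mbps, test_seconds):
    """Data moved by one test that saturates the link (megabits / 8 = megabytes)."""
    return link_mbps * test_seconds / 8


def daily_usage_gb(link_mbps, test_seconds, interval_minutes):
    """Data used per day, in GB (1 GB = 1000 MB), when a test runs every interval."""
    tests_per_day = 24 * 60 / interval_minutes
    return speed_test_data_mb(link_mbps, test_seconds) * tests_per_day / 1000


# Hypothetical: a 100 Mbps WAN, a 10-second test, one test every 5 minutes.
print(speed_test_data_mb(100, 10), "MB per test")
print(daily_usage_gb(100, 10, 5), "GB per day")
```

On a metered LTE backup line, even this reduced schedule would use far more data than the small detection packets the router sends today.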

 

10. Why do the WAN statistics show differently from the preset ratio?

 

You should check whether you have enabled Application Optimized Routing. If so, please read the User Guide or the explanation next to the feature. Here is what it says: With Application Optimized Routing enabled, the router will consider the source IP address and destination IP address (or destination port) of the packets as a whole and record the WAN port they pass through. Then the packets with the same source IP address and destination IP address (or destination port) will be forwarded to the recorded WAN port. This feature ensures that multi-connected applications work properly. This helps avoid potential issues such as source IP changes or packet reordering that can occur due to multi-WAN traffic splitting.

 

So, consider whether your traffic repeatedly uses the same source and destination IP. For example, when you watch a movie on a streaming platform, the stream should not be load balanced, and you should enable Application Optimized Routing.

In this case, the traffic is routed through a single WAN only, which explains why the traffic statistics differ from the preset ratio.
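The effect on the statistics can be reproduced with a toy Python model of the behavior quoted above: the first packet of each source/destination pair picks a WAN, and every later packet of that pair follows the recorded choice. The class name, the round-robin choice for new pairs, and the addresses are assumptions made for illustration.

```python
from collections import Counter


class StickyRouter:
    """Remember which WAN each (source, destination) pair used first."""

    def __init__(self, wans):
        self.wans = list(wans)
        self.table = {}       # (src, dst) -> recorded WAN
        self.next_index = 0   # round-robin pointer for new pairs (assumption)

    def route(self, src, dst):
        key = (src, dst)
        if key not in self.table:
            self.table[key] = self.wans[self.next_index % len(self.wans)]
            self.next_index += 1
        return self.table[key]


router = StickyRouter(["WAN1", "WAN2"])

# Hypothetical traffic: one long stream plus a couple of packets to another host.
packets = [("192.168.0.10", "203.0.113.5")] * 98 + [("192.168.0.11", "198.51.100.7")] * 2
usage = Counter(router.route(src, dst) for src, dst in packets)
print(usage)
```

Even with a 1:1 ratio in mind, nearly all packets land on one WAN here because they belong to the same pair.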

 

Note:

 

1. We cannot make failover seamless (imperceptible) when switching to the other WAN after the network fails. Because of the differences between TCP and UDP, this is not possible. Sometimes you have to reset the session by refreshing the page.

2. In Standalone mode, unlike Controller mode, you cannot set the Online Detection period. In this case, you may face a few seconds of downtime before the router detects the offline state and switches over. We have optimized this in previous firmware updates, and we consider it acceptable.

 

QA:

 

Q1: In Online Detection Auto mode, does it detect the online status by pinging the DNS server or by sending DNS lookup queries? What are the domain names being queried?

 

A1: In Auto mode, detection is done by sending DNS lookup queries. The following 4 domain names are queried: www.google.com, www.tp-link.com, www.ieee.org, and www.w3.org. A successful resolution means your network is online, and vice versa.

 

Q2: How many times does the router send detection packets in Online Detection Auto mode before considering the WAN port disconnected?

A2: In each detection, each WAN port individually sends DNS lookup packets for the 4 domains. If there are no replies, the WAN port is considered disconnected.

 

Q3: How long does it take for the router to switch to Backup WAN after determining that Primary WAN is disconnected?

A3: Based on current test results, it takes about 30s-60s (Standalone mode). This duration depends on the number of WAN ports and the online check interval. With Link Backup, the backup WAN's status is offline, and the online check must first detect the Primary WAN as offline before the Backup WAN is enabled.

 

Q4: Can the Ping test and DNS query (detection) be performed simultaneously in Online Detection Manual mode? What mechanism determines if a WAN port is online or offline? Is it necessary for both Ping and DNS queries to fail for a connection to be considered down?

 

A4: In Manual mode, the Ping test and DNS query can be performed simultaneously. Set the Ping IP address and a DNS Lookup server properly, and it will work as expected.

 

Priority is given to the DNS domain queries. If all 4 domain names fail during a DNS query, the Ping test is then conducted as a cross-check.

If any of the 4 domain names resolves successfully during the DNS query, the Ping test is not carried out.

 

For the DNS check: if an IP address can be resolved for any one of the 4 domain names, the WAN is considered online. If none of them resolves, the DNS check fails, and the router then pings to check the online status.

The 4 domain names have higher priority than the Ping test.

For ping: if the IP address can be reached with ping, the WAN is considered online; otherwise, it is considered offline.
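The order described in A4 can be written as a short decision function. The Python sketch below is my reading of that description, not router firmware code. The resolve and ping callables are injected so you can try the logic without touching the network; the resolves helper uses your computer's own resolver, which is not the same as querying through a specific WAN port.

```python
import socket

DOMAINS = ["www.google.com", "www.tp-link.com", "www.ieee.org", "www.w3.org"]


def resolves(domain):
    """True if this computer's resolver returns at least one address for the domain."""
    try:
        socket.getaddrinfo(domain, None)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        return False


def wan_is_online(domains, ping_target, resolve, ping):
    """DNS check first; ping is only used as a cross-check when every lookup fails."""
    if any(resolve(domain) for domain in domains):
        return True               # one successful lookup is enough, ping is skipped
    return ping(ping_target)      # all lookups failed, fall back to ping


# Offline demonstration with fake probes (no packets are sent).
dns_down = lambda domain: False
ping_ok = lambda ip: True
ping_fail = lambda ip: False

print(wan_is_online(DOMAINS, "8.8.8.8", dns_down, ping_ok))    # ping rescues the check
print(wan_is_online(DOMAINS, "8.8.8.8", dns_down, ping_fail))  # both fail
```

To try it against your real connection, pass resolves as the resolve argument and supply your own ping function.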

 

Q5: How many times does Online Detection Manual mode perform tests before considering a connection dropped?

A5: In Manual mode, tests run at the interval configured for the online check. During each test, when either the specified IP address cannot be pinged or the default domain name's DNS lookup fails, the WAN port is set as offline directly.

If the Internet is disconnected right after a test, the WAN is only set to offline at the next test, which may cause a delay.
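As a rough illustration of that delay, the Python sketch below assumes tests fire at fixed multiples of the interval (0, interval, 2 x interval, and so on) and that a single failed test marks the WAN offline. It ignores the time a test itself takes, so real switch times will be longer.

```python
def detection_delay(failure_time, interval):
    """Seconds from the failure until the next scheduled test notices it.

    A failure that coincides with a test is treated as just missed,
    so it waits one full interval.
    """
    return interval - (failure_time % interval)


# Hypothetical 10-second interval.
print(detection_delay(12, 10))  # failure 2 s after a test
print(detection_delay(19, 10))  # failure 1 s before the next test
```

The worst case under these assumptions is one full interval, which is why a shorter interval in Controller mode shortens the downtime.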

 

Q6: In Online Detection Manual mode, can we specify a DNS query domain name instead of just a DNS server address?

A6: Currently, specifying a DNS query domain name (DoH or DoT) is not supported. The setting takes a DNS Lookup server as an IP address.

 

Q7: What are the use cases for Always Online mode?

A7: This use case may arise in internal LAN environments where DNS queries cannot be resolved. It has also been observed that some ISPs block specific DNS queries made by the user gateway, such as for the domain name www.google.com, which results in the WAN being detected as offline.

 

Q8: Does Backup WAN perform Online Detection when Primary WAN is online? After switching to Backup WAN due to Primary being disconnected, does Primary continue with Online Detection and automatically switch back once it detects that Primary is online again?

A8: Backup WAN does NOT perform Online Detection when Primary WAN is online.

After switching to Backup WAN due to the disconnection of Primary WAN, Primary WAN continues with the Online Detection test. Once it detects that Primary WAN is back online, automatic switchback occurs.

 

Q9: Are there any differences in Online Detection and Link Backup mechanisms between Controller mode and Standalone mode?

A9:

In Controller mode:

1. The time interval for Online Detection can be modified on the Controller.

2. In Controller mode, the ping test (AKA Echo Server) accepts an IP or hostname; in Standalone mode, it currently accepts an IP only.

3. In Controller mode, you cannot specify a DNS query server for DNS checks; only the ping test (Echo Server) is available.

4. The "Always Online" option is NOT available on the Controller.

5. There are two backup modes (for Link Backup) on the Controller that are not available in Standalone mode.

 

Mode 1:

 

Mode 2:

 

Update Log:

 

Aug 28th, 2024:

Add contents.

 

Jun 20th, 2024:

Add contents - Explanation 9.

 

Mar 11th, 2024:

Update the format.

 

Jan 11th, 2024:

Update the format for a better reading experience.

Optimize the QA for better referring.

 

Jan 10th, 2024:

Update the title.

 

Dec 20th, 2023:

Add QA section.

Update Contents.

 

Recommended Threads:

 

Troubleshooting Online Detection and Link Backup (Failover) Don't Take Effect

Troubleshooting Custom DDNS Is Not Working

 

Feedback:

 

  • If this was helpful, you are welcome to give us Kudos by clicking the upward triangle below.
  • If there is anything unclear in this solution post, please feel free to comment below.
  • If you encounter such an issue, please follow the troubleshooting above to check your settings. Besides, ensure your Omada Controller and Gateway are running with the latest firmware.
  • If the issue still exists after you try the suggestion above, please feel free to comment below or contact our support team with a detailed description of your issue and the steps you have tried.

 

Thank you in advance for your valuable feedback!

 

------------------------------------------------------------------------------------------------

Have other off-topic issues to report? 

Welcome to > Start a New Thread < and elaborate on the issue for assistance.

 

Re:Common Questions About the Load Balancing, Link Backup(Failover) & Online Detection
2024-08-29 08:03:11

  @Clive_A 

Great topic! Load balancing and link backup (failover) are essential for maintaining a reliable network, especially in environments where uptime is critical. By distributing traffic across multiple connections, load balancing ensures that no single link is overwhelmed, which helps maintain performance. Meanwhile, failover automatically switches to a backup connection if the primary one fails, keeping your network online without interruption.

In terms of online detection, it’s crucial to have a robust system in place to monitor the health of each link. This ensures that the failover process is seamless and minimizes downtime.

