TL-SG108E arp flood problem
Model : TL-SG108E
Hardware Version : TL-SG108E 2.0 (also had the problem on hardware version 1.0, the switch was swapped at the store since the issue could not be found)
Firmware Version : 1.0.1 Build 20160108 Rel. 57851 (also tested multiple firmware releases on the hardware 1.0)
ISP :
It seems that when the switch receives too many ARP requests, it stops forwarding them.
To explain in detail, here is my setup at home:
ISP Modem --- (eth0) Server (eth1) --- TP-Link --- Wireless bridge / Digibox
The server, which acts as a NAT router, NAS and media server, has 2 Ethernet interfaces. eth0 is connected to my cable (coax) Internet provider's modem.
The 2nd interface of the server (eth1) is connected to port 1 on the TL-SG108E.
The default VLAN (id 1) carries the LAN, which is a private subnet behind NAT on the server.
On port 2, a wireless bridge is connected which shares this VLAN to wireless devices.
Port 1 also carries a tagged VLAN (id 2), which bridges the WAN from the ISP modem (as seen on eth0 at the server) to port 8 (untagged), where the digibox is connected.
This digibox is provided by the Internet provider and requires an Ethernet connection for on-demand streaming.
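To make the port layout concrete, here is a minimal Python sketch of the VLAN membership described above. It only models the configuration, it does not talk to the switch. Two details are assumptions that the post does not state: that port 1 is untagged in VLAN 1, and that the unused ports 3-7 are left in the default VLAN. The names `Vlan`, `VLANS` and `broadcast_egress` are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vlan:
    vid: int
    tagged: frozenset    # ports that carry this VLAN with an 802.1Q tag
    untagged: frozenset  # ports that carry this VLAN without a tag

    def members(self):
        return self.tagged | self.untagged


# Port layout from the post. ASSUMPTIONS: port 1 is untagged in VLAN 1,
# and the unused ports 3-7 stay in the default VLAN.
VLANS = {
    1: Vlan(1, tagged=frozenset(), untagged=frozenset({1, 2, 3, 4, 5, 6, 7})),
    2: Vlan(2, tagged=frozenset({1}), untagged=frozenset({8})),
}


def broadcast_egress(vlans, vid, ingress_port):
    """Return {port: 'tagged' | 'untagged'} for a broadcast frame on VLAN
    `vid` that entered the switch on `ingress_port`."""
    vlan = vlans[vid]
    if ingress_port not in vlan.members():
        return {}  # ingress port is not a member: the frame is dropped
    return {
        port: ("tagged" if port in vlan.tagged else "untagged")
        for port in sorted(vlan.members())
        if port != ingress_port
    }
```

With this layout, `broadcast_egress(VLANS, 2, 1)` returns `{8: 'untagged'}`: on paper, a broadcast arriving on VLAN 2 from the server should only ever leave on the digibox port and never touch the LAN ports in VLAN 1.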
The modem provides 2 networks:
- All digiboxes are registered with the provider, so if the MAC address of a digibox is seen, it'll receive an IP in the 10.0.0.0/8 range
- Any other device will get a public internet IP (real public IP, no NAT)
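A quick way to check which of the two networks a device ended up in is to classify its leased address. The sketch below uses only Python's standard `ipaddress` module; the function name `classify_lease` and the labels it returns are made up for this illustration.

```python
import ipaddress

# Range handed out to registered digiboxes, as described above.
DIGIBOX_POOL = ipaddress.ip_network("10.0.0.0/8")


def classify_lease(address):
    """Say which of the modem's two networks an address belongs to."""
    ip = ipaddress.ip_address(address)
    if ip in DIGIBOX_POOL:
        return "digibox"
    if ip.is_global:
        return "public"
    return "other"  # e.g. an address from the private LAN behind the server
```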
For a few months I noticed that my local devices kept losing their network connection. A tcpdump quickly showed a lot of ARP requests being sent out to try to re-establish communication, but I couldn't find the root cause. As an interim solution I replaced the TL-SG108E a few times with another switch, because the frequent disconnects, which last about 30 seconds each, were really annoying.
Because I was looking at the problem from the LAN side, I didn't find the root cause. During some troubleshooting today, however, I noticed a lot of ARP traffic on my modem, which is bridged to the switch on VLAN 2.
Almost all of the ARP traffic consists of broadcasts from the ISP headend to map IPs to MAC addresses. All of them have the same source MAC address and are sent as broadcasts. I see about 30 ARP requests per second passing through the interface.
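For anyone who wants to measure the same thing, here is a rough Python sketch that counts ARP requests per second and per source MAC from tcpdump's text output. It assumes the capture was made with something like `tcpdump -l -tt -e -n -i eth0 arp`, and that each line starts with an epoch timestamp followed by the source and destination MAC and contains `Request who-has ... tell ...`. The exact line format can differ between tcpdump versions, so treat the regular expression as an assumption to adjust. The function names are made up for this example.

```python
import re
from collections import Counter

# Assumed line shape (tcpdump -tt -e -n arp):
# 1455555555.100000 00:11:22:33:44:55 > ff:ff:ff:ff:ff:ff, ethertype ARP
#   (0x0806), length 60: Request who-has 10.0.0.5 tell 10.0.0.1, length 46
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+) (?P<src>[0-9a-fA-F:]{17}) > (?P<dst>[0-9a-fA-F:]{17}),"
    r".*?Request who-has (?P<target>\S+)(?: \(\S+\))? tell (?P<sender>[^,\s]+)"
)


def arp_request_stats(lines):
    """Count ARP requests per whole second and per source MAC address."""
    per_second = Counter()
    per_source = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # replies and non-ARP lines are ignored
        per_second[int(float(match["ts"]))] += 1
        per_source[match["src"]] += 1
    return per_second, per_source


def average_rate(per_second):
    """Average requests per second over the observed time span."""
    if not per_second:
        return 0.0
    span = max(per_second) - min(per_second) + 1
    return sum(per_second.values()) / span
```

Passing `sys.stdin` (or an open capture log) to `arp_request_stats` gives the two counters. If a single source MAC dominates `per_source`, that matches the headend broadcast pattern described above.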
When I added an arptables rule on the server to filter out all ARP traffic from VLAN 2 that is not destined for my digibox, all the problems on the switch went away.
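The decision that such a filter makes can be sketched in Python. This is not the actual arptables rule, only an illustration of the logic: parse the 28-byte Ethernet/IPv4 ARP payload and forward it only if the target IP is the digibox. The function names and the example digibox address used with them (`10.1.2.3`) are hypothetical.

```python
import ipaddress
import struct

# Ethernet/IPv4 ARP payload: htype, ptype, hlen, plen, oper,
# sender MAC, sender IP, target MAC, target IP (28 bytes in total).
ARP_FORMAT = "!HHBBH6s4s6s4s"
ARP_LEN = struct.calcsize(ARP_FORMAT)


def arp_target_ip(payload):
    """Extract the target IPv4 address from a raw ARP payload."""
    if len(payload) < ARP_LEN:
        raise ValueError("truncated ARP payload")
    htype, ptype, hlen, plen, _oper, _sha, _spa, _tha, tpa = struct.unpack(
        ARP_FORMAT, payload[:ARP_LEN]
    )
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        raise ValueError("not an Ethernet/IPv4 ARP packet")
    return ipaddress.IPv4Address(tpa)


def should_forward(payload, digibox_ip):
    """Forward an ARP packet from the WAN side only if it targets the digibox."""
    return arp_target_ip(payload) == ipaddress.IPv4Address(digibox_ip)


def build_arp_request(sender_mac, sender_ip, target_ip):
    """Build an ARP request payload (helper for trying out the filter)."""
    return struct.pack(
        ARP_FORMAT, 1, 0x0800, 6, 4, 1,
        sender_mac, ipaddress.IPv4Address(sender_ip).packed,
        b"\x00" * 6, ipaddress.IPv4Address(target_ip).packed,
    )
```

With a filter like this in place, the roughly 30 requests per second aimed at other subscribers never reach the switch, and only the occasional request for the digibox itself gets through.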
This seems to suggest that the switch applies some kind of rate limiting to ARP requests, although I don't see any setting for that (and in the past I've also reproduced the issue with the switch reset to factory defaults).
I'm guessing that my problem started after the ISP decided to put more subnets on the same layer 2 segment, which increased the number of ARP requests.
As the ARP filtering is the only thing I've changed to fix the issue (the WAN is still on tagged VLAN 2 and, except for the ARP, all traffic still gets through), I'm rather sure that's the root problem with the switch. Could someone provide more information on this?