Hello,
I'm having a huge problem that I'm finding nearly impossible to crack.
On a large spine-leaf network I have 3 core PC6224 switches connected via their 10Gbit ports (there's no STP, no mesh, nothing; just a single connection).
All the gigabit ports of those switches in turn connect to distribution switches.
The thing is that during DDoS attacks or "high" traffic conditions (relatively speaking: it happens at around 1.5 to 2 Gbit/s on a 10Gbit port), all of a sudden a server on a gigabit port of one switch starts losing packets to the server on the 10Gb port (the one carrying the traffic). A simple ping shows the lost packets.
The same happens vice versa, and a ping from the 10GbE server to the switch's virtual IP also shows dropped packets.
This shouldn't be happening at all: at 1.5 to 2 Gbit/s the 10Gbit port is only 15 to 20% utilized, nowhere near even HALF of its capacity!
The switches are essentially at default configuration: no VLANs, no routing, no QoS, and flow control enabled globally.
I've checked statistics using RMON (sadly the PC6224 lacks a simple "show how much bandwidth is being used RIGHT NOW on every port" command) and all I see are several thousand pause frames on the XG and gigabit interfaces, but nothing alarming (something like 20K pause frames in 10 million packets).
Port errors are also very, very low and happen mainly on the gigabit ports.
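Since there's no real-time bandwidth display, a rough workaround is to read a port's byte counter twice and compute the rate from the difference. The snippet below is only a sketch of that arithmetic: the function names are made up for illustration, and it assumes the octet and frame counters are collected by hand (from RMON, or from SNMP if the switch exposes them) as plain integers. The 64-bit counter width is also an assumption; pass 32 if the counter is a 32-bit one.

```python
def rate_bps(octets_t0, octets_t1, seconds, counter_bits=64):
    """Average bits/s between two octet-counter samples.

    Tolerates a single counter wrap between the two samples.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta = (octets_t1 - octets_t0) % (1 << counter_bits)
    return delta * 8 / seconds


def utilization(bps, link_bps):
    """Fraction of the link capacity in use (0.0 to 1.0)."""
    return bps / link_bps


def pause_ratio(pause_frames, total_frames):
    """Fraction of frames on a port that were pause frames."""
    return pause_frames / total_frames


if __name__ == "__main__":
    # Hypothetical samples taken 10 seconds apart on a 10Gbit port.
    bps = rate_bps(0, 2_500_000_000, 10)
    print(f"rate: {bps / 1e9:.2f} Gbit/s")
    print(f"utilization: {utilization(bps, 10e9):.0%}")
    # The pause-frame figures quoted above: 20K in 10 million packets.
    print(f"pause frames: {pause_ratio(20_000, 10_000_000):.2%}")
```

Keep in mind this only gives an average over the sampling interval, so short bursts that fill the port buffers would not show up in it.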
Something else curious I've found in the address tables is that some ports (especially the 10GbE ones) have a LARGE number of consecutive MAC addresses, which smells like an ARP attack:
aa:00:00:15:00:04 | 1/xg1
VLAN 1 | aa:00:00:15:00:05 | 1/xg1
VLAN 1 | aa:00:00:15:00:06 |
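To get an idea of how long these runs really are, the MAC addresses can be pulled out of a saved copy of the address table and checked for consecutive values. This is just a sketch: the helper names are hypothetical, and it assumes the addresses have already been extracted into a list of colon-separated strings.

```python
def mac_to_int(mac):
    """Convert 'aa:00:00:15:00:04' to its 48-bit integer value."""
    return int(mac.replace(":", ""), 16)


def int_to_mac(value):
    """Convert a 48-bit integer back to colon-separated hex."""
    digits = f"{value:012x}"
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def consecutive_runs(macs, min_len=3):
    """Return (first_mac, last_mac, length) for each run of
    numerically consecutive MAC addresses of at least min_len."""
    values = sorted({mac_to_int(m) for m in macs})
    runs = []
    start = prev = None
    for v in values:
        if prev is None or v != prev + 1:
            # The previous run (if any) ends here; keep it if long enough.
            if start is not None and prev - start + 1 >= min_len:
                runs.append((int_to_mac(start), int_to_mac(prev), prev - start + 1))
            start = v
        prev = v
    if start is not None and prev - start + 1 >= min_len:
        runs.append((int_to_mac(start), int_to_mac(prev), prev - start + 1))
    return runs


if __name__ == "__main__":
    table = [
        "aa:00:00:15:00:04",
        "aa:00:00:15:00:05",
        "aa:00:00:15:00:06",
        "00:11:22:33:44:55",  # made-up unrelated address for contrast
    ]
    for first, last, length in consecutive_runs(table):
        print(f"{first} .. {last}: {length} consecutive addresses")
```

A handful of short runs could simply be multi-port NICs or virtual machines; very long runs on a single port would be much more suspicious.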
Can a flood of MAC addresses like this be nuking the switches?
How do I prevent that? I can't use port security, as this is not a "closed" network (it's more like a large ISP) and every port connects to another switch, and since there's no DHCP I can't use DHCP snooping or anything of the sort...
Ideas? I'm about to turn off flow control globally on all switches to see if that makes a difference.