Current Events > Holy shit... night time maintenance went wrong pretty badly

CableZL
03/25/21 4:59:45 AM
#1:


We have a huge project going on to replace old network equipment at a bunch of sites. Tonight we only had 2 switch stacks to replace, but the 2nd one was a bit more complicated. We normally just pre-configure the new switches, ship them to the site, take all the old switches out and then put all the new switches in.

The old switches are in their end of life phase, which is why we're having to do this. What happened with the 2nd switch stack prior to tonight:
  • Old switch 1: Cisco 3750s
  • Two new switches were added that are Aruba 2930Ms. This is the vendor we're going with for the access layer now. Because they're different vendors, the new switches couldn't be stacked with the old switch. So they were daisy chained.
Work tonight:
  • Replaced the 3750. This went just fine
  • Factory reset the other 2 Aruba 2930Ms and then joined them to the new replacement switch. THEY. WOULD. NOT. COME. UP. IN. THE. RIGHT. ORDER.


Stack physical topology:

SWITCH1 - should be stack member 1
SWITCH2 - should be stack member 2
SWITCH3 - should be stack member 3

Except the software was saying switch 2 was member 1, switch 3 was member 2, and switch 1 was member 3. WTF
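
A minimal sketch of that mismatch check, assuming you already know the physical order and the member IDs the software reports. The function name and the data layout are made up for illustration; nothing here talks to a real switch.

```python
# Hypothetical helper: compare the intended member ID of each physical
# switch (top to bottom) against the ID the stack software reports.

def find_member_mismatches(expected, reported):
    """Return (switch, expected_id, reported_id) for every switch whose ID is wrong."""
    mismatches = []
    for switch, expected_id in expected.items():
        reported_id = reported.get(switch)
        if reported_id != expected_id:
            mismatches.append((switch, expected_id, reported_id))
    return mismatches


# Intended layout from the physical topology above.
expected = {"SWITCH1": 1, "SWITCH2": 2, "SWITCH3": 3}
# What the software reported in this incident.
reported = {"SWITCH1": 3, "SWITCH2": 1, "SWITCH3": 2}

for switch, want, got in find_member_mismatches(expected, reported):
    print(f"{switch}: expected member {want}, software says member {got}")
```

An empty result would mean the stack came up in the intended order.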

Tried different things for about an hour and a half, but couldn't figure it out. Called Aruba TAC. Worked with them for 2 hours and got it figured out, finally.

What we had to do:
  • Allow the switch stack to come up however it wanted
  • Console into the commander (primary) switch (it was switch 2 at this point, and it was the correct member ID)
  • Remove switches 1 and 3 from the configuration
  • Configure switches 1 and 3 as expected
  • Connect switch 1 to the stack and boot it up
  • Do a "redundancy switchover" to make switch 1 the commander
  • Boot up switch 3
  • Switch 3 booted up as the correct member ID


Hooly shit

I'm sleepy AF, lol

---
#2
Post #2 was unavailable or deleted.
CableZL
03/25/21 5:01:32 AM
#3:


LivingLegend posted...
Too long; didnt read.
tl:dr added, lol

---
CableZL
03/25/21 5:09:22 AM
#4:


Got it done with about 30 minutes to spare before we would have to start rolling that part of the change back. Whew. Just gotta do clean up stuff and then I'm going to sleep

---
Johnny_Nutcase
03/25/21 5:16:23 AM
#5:


What routing protocol did you use? I've been dicking around with VLAN OSPF, RIPv2, EIGRP.

---
I got a freakin muscle spasm in my back the gear slipped the air brakes were shot to hell. I mean there was nothing I could do, boom right into the post office
nfearurspecimn
03/25/21 5:16:57 AM
#6:


I don't know a TON about networking (only took 2 classes in HS), but why does the order matter?

---
Wake up. You have to wake up. *currently a preta/hungry ghost*
Dai Grepher: I was wrong. My entire theory is incorrect. Zero Mission IS a remake of Metroid.
CableZL
03/25/21 5:21:11 AM
#7:


nfearurspecimn posted...
I don't know a TON about networking (only took 2 classes in HS), but why does the order matter?
In the configuration, the settings for each port in this stack are specific to the stack member number and port number. Member 1 port 1 is 1/1. Member 2 port 1 is 2/1.

So, if we dispatch a tech or a vendor to a site to plug in a device on member 1 port 1 (the switch that's on top), but they're actually plugging into member 3 port 1 (the switch on top is member 3 instead), then we on the network team would be configuring the wrong port and it would take a bit to realize we're talking about two different ports. There is also the risk that it would break connectivity to a critical server or wireless access point when the configuration is changed incorrectly.
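
Here's a rough sketch of how that mix-up plays out, assuming the member/port label format described above. The function and the top/middle/bottom names are hypothetical, just to illustrate the mapping.

```python
# Hypothetical sketch: resolve a "member/port" label to the physical switch
# that actually owns it, given the member IDs the stack really assigned.

def physical_switch_for(port_label, member_to_switch):
    """Map a label such as '1/1' to (physical switch, port number)."""
    member_text, port_text = port_label.split("/")
    return member_to_switch[int(member_text)], int(port_text)


# Intended numbering: member 1 is the top switch.
intended = {1: "top", 2: "middle", 3: "bottom"}
# Mis-numbered stack from this thread: the top switch came up as member 3.
actual = {1: "middle", 2: "bottom", 3: "top"}

label = "1/1"
print("Network team expects", label, "on:", physical_switch_for(label, intended))
print("Config actually applies to:", physical_switch_for(label, actual))
```

With the mis-numbered stack, the tech plugging into port 1 on the top switch is really on 3/1, while the network team is editing 1/1 on the middle switch.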

---
CableZL
03/25/21 5:23:33 AM
#8:


Johnny_Nutcase posted...
What routing protocol did you use? I've been dicking around with VLAN OSPF, RIPv2, EIGRP.
In our environment, we use EIGRP and BGP in the data centers and BGP at the "edge," which includes any location that is not at the data centers.

---
Johnny_Nutcase
03/25/21 5:26:24 AM
#9:


You ever fuck with route distribution? That always screwed me up. Inside local, outside global.... ohhh man.

I'm also thinking of NAT/PAT

---
I got a freakin muscle spasm in my back the gear slipped the air brakes were shot to hell. I mean there was nothing I could do, boom right into the post office
CableZL
03/25/21 5:34:03 AM
#10:


Johnny_Nutcase posted...
You ever fuck with route distribution? That always screwed me up. Inside local, outside global.... ohhh man.

I'm also thinking of NAT/PAT
Yeah, route redistribution is complicated, but it's not hard. We do a lot of that in our environment. Once you get it down, you got it. Just gotta keep using the skill so you don't lose it

NAT/PAT is also complicated, but it's another thing where once you got it, you got it.

---
CableZL
03/25/21 5:37:17 AM
#11:


Also, this problem highlights a big difference between Cisco switching and Aruba switching.

If you have a Cisco switch stack and they have wrong member IDs, you can literally just do something like "switch 4 renumber 8". Then just reboot the stack and repeat until the order is correct. With Aruba, you essentially have to remove the switch from the stack, configure it at the right member number, then physically reboot it because you can't control it once you remove it from the stack. Can't really do that without having someone on site unless you have remote console access set up. We have remote console for configuring stuff before shipping it out, but not for most access switches.

---
Johnny_Nutcase
03/25/21 5:41:09 AM
#12:


I thought that was the point of route redistribution. You could set your router to translate another router's shit.

---
I got a freakin muscle spasm in my back the gear slipped the air brakes were shot to hell. I mean there was nothing I could do, boom right into the post office
CableZL
03/25/21 5:43:29 AM
#13:


Johnny_Nutcase posted...
I thought that was the point of route redistribution. You could set your router to translate another router's shit.
Yeah, but routing is all layer 3 stuff. Switch stacks involve layers 1 and 2 only.

You wouldn't do route redistribution on a layer 2 switch. That's for routers, some firewalls, and layer 3 switches.

---
CableZL
03/25/21 3:38:56 PM
#14:


Johnny_Nutcase
03/25/21 7:04:51 PM
#15:


What do you do if you can't fix something? Is there another Network engineer to bounce ideas off of? I always worried about that being in a position where I AM in charge and it's still a problem I can't figure out.

---
I got a freakin muscle spasm in my back the gear slipped the air brakes were shot to hell. I mean there was nothing I could do, boom right into the post office
Complete_Idi0t
03/25/21 7:05:37 PM
#16:


So is that why the ship got stuck in the canal?
nothanks1
03/25/21 7:08:22 PM
#17:


All I can understand is: if the entire network in the school went down it wouldn't be that bad since it's in the computer policy that teachers cannot expect all computers to work and must prepare non-digital lessons* (so like if you're doing a class on computer programming then duh you need them and it's a free class).
Once my time is done for the day I am under no obligation to respond to any communication and if a teacher ever does manage to contact my private cell phone then they get a visit from the HR people
CableZL
03/25/21 7:11:58 PM
#18:


Johnny_Nutcase posted...
What do you do if you can't fix something? Is there another Network engineer to bounce ideas off of? I always worried about that being in a position where I AM in charge and it's still a problem I can't figure out.
Yeah, we have a team of people. If you're the only network engineer, the company needs to make sure they pay to have full support on all of their network equipment so that you can always rely on technical assistance from the vendor.

---