FRR says “Finite State Machine Error” for IPv6 BGP on Linux

[Image: the official mascot of the BGP Finite State Machine (FSM)]

If you turn off link-local addresses for the interface (those horrible 169.254.50.231 type addresses, and also the IPv6 fe80::dead:beef:4:f00d/64 things), FRR goes sulky. In netplan, it looks like this:

ethernets:
  eth0:
    link-local: [ ]

When FRR is sulky, it says it will advertise IPv6 subnets, but it doesn’t actually advertise the routes, because it never establishes a session with its peer(s). Here it says it will advertise, but it doesn’t:

# show ip bgp all
...
Network Next Hop Metric LocPrf Weight Path
*> 2000:deaf:7012:feed::/64 :: 0 32768 i

tcpdump says this, which doesn’t help much:

# tcpdump -i any port bgp
17:27:49.667454 eth0  Out ifindex 3 52:54:00:a1:34:67 ethertype IPv6 (0x86dd), length 113: (class 0xc0, flowlabel 0xdde41, hlim 1, next-header TCP (6) payload length: 53) 2000:deaf::1194.45245 > 2000:deaf::80.179: Flags [P.], cksum 0x445a (incorrect -> 0xdc08), seq 1:22, ack 1, win 507, options [nop,nop,TS val 3099884752 ecr 3196106109], length 21: BGP
	Notification Message (3), length: 21, Finite State Machine Error (5) subcode Unspecified Error (0)
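
For what it's worth, the "Finite State Machine Error (5) subcode Unspecified Error (0)" part is just error code 5, subcode 0 of the BGP NOTIFICATION message. Below is a rough lookup for the six error codes that RFC 4271 defines. The function name bgp_notify_name is made up for this sketch; it is not part of FRR or tcpdump, and it ignores subcodes and any codes added by later RFCs.

```shell
#!/bin/sh
# Map a BGP NOTIFICATION "code/subcode" pair to the RFC 4271 error code name.
# bgp_notify_name is a hypothetical helper, not an FRR or tcpdump command.
bgp_notify_name() {
    code="${1%%/*}"      # text before the slash, e.g. "5" from "5/0"
    case "$code" in
        1) echo "Message Header Error" ;;
        2) echo "OPEN Message Error" ;;
        3) echo "UPDATE Message Error" ;;
        4) echo "Hold Timer Expired" ;;
        5) echo "Finite State Machine Error" ;;
        6) echo "Cease" ;;
        *) echo "Unknown ($code)" ;;
    esac
}

bgp_notify_name 5/0
```

So the code only says the state machine hit something it did not expect; it does not say what, which is why the packet capture alone doesn't help much.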

To turn on debug, it’s a few easy and obvious commands:

# vtysh
config term
debug bgp as4
debug bgp bfd
debug bgp flowspec
debug bgp keepalives
debug bgp neighbor-events
debug bgp pbr
debug bgp updates
debug bgp vpn
debug bgp bestpath
debug bgp evpn
debug bgp graceful-restart
debug bgp labelpool
debug bgp nht
debug bgp update-groups
debug bgp vnc
debug bgp zebra
log syslog debug
end
terminal monitor

And then the terminal displays all the debug you could want, including the confession below that FRR’s BGP on IPv6 cannot function without a link-local address on the interface. I suppose some IPv6 routers don’t like routing without a link-local address to talk to … or it’s in the standard, or something stupid:

.336 [DEBG] bgpd: [YTARA-Q9ZD1] [Event] BGP connection from host 2000:deaf::80 fd 27
.336 [DEBG] bgpd: [JYX7T-SMTAQ] bgp_peer_gr_init called ..
.336 [DEBG] bgpd: [YB7SD-E2DZS] [BGP_GR] Peer state changed  --to-->  : 4 : !
.336 [DEBG] bgpd: [VTCGN-KEKBQ] bgp_peer_gr_flags_update [BGP_GR] called !
.336 [DEBG] bgpd: [VBM1Z-TD8QM] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART_HELPER : Set : !
.336 [DEBG] bgpd: [M49N5-G8MHD] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART : UnSet : !
.336 [DEBG] bgpd: [SRJ2F-0FBJK] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART_GLOBAL_INHERIT : Set : !
.336 [DEBG] bgpd: [VTCGN-KEKBQ] bgp_peer_gr_flags_update [BGP_GR] called !
.336 [DEBG] bgpd: [VBM1Z-TD8QM] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART_HELPER : Set : !
.336 [DEBG] bgpd: [M49N5-G8MHD] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART : UnSet : !
.336 [DEBG] bgpd: [SRJ2F-0FBJK] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART_GLOBAL_INHERIT : Set : !
.336 [DEBG] bgpd: [WNKP5-SN018] Found existing bnc 2000:deaf::1/128(0)(VRF default) flags 0xf ifindex 0 #paths 0 peer 0x7ebfd9c26031
.336 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Active established_peers 0
.336 [DEBG] bgpd: [ZQHFG-DQGX1] 2000:deaf::1 went from Idle to Active
.337 [DEBG] bgpd: [ZWCSR-M7FG9] 2000:deaf::1 [FSM] TCP_connection_open (Active->OpenSent), fd 27
.337 [WARN] bgpd: [ZM2F8-MV4BJ][EC 33554509] Interface: eth0 does not have a v6 LL address associated with it, waiting until one is created for it
.337 [ERR!] bgpd: [M3MYP-BVWDS][EC 33554460] 2000:deaf::1: nexthop_set failed, resetting connection - intf eth0
.337 [ERR!] bgpd: [NQGZV-Y3W62][EC 100663299] bgp_connect_success: bgp_getsockname(): failed for peer 2000:deaf::1, fd 27
.337 [INFO] bgpd: [HZN6M-XRM1G] %NOTIFICATION: sent to neighbor 2000:deaf::1 5/0 (Neighbor Events Error/Unspecific) 0 bytes
.337 [DEBG] bgpd: [VTCGN-KEKBQ] bgp_peer_gr_flags_update [BGP_GR] called !
.337 [DEBG] bgpd: [VBM1Z-TD8QM] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART_HELPER : Set : !
.337 [DEBG] bgpd: [M49N5-G8MHD] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART : UnSet : !
.337 [DEBG] bgpd: [SRJ2F-0FBJK] [BGP_GR] Peer 2000:deaf::1 Flag PEER_FLAG_GRACEFUL_RESTART_GLOBAL_INHERIT : Set : !
.337 [DEBG] bgpd: [ZWCSR-M7FG9] 2000:deaf::1 [FSM] BGP_Stop (Active->Idle), fd 27
.339 [DEBG] bgpd: [T91AW-FGMHW] bgp_fsm_change_status : vrf default(0), Status: Deleted established_peers 0
.339 [DEBG] bgpd: [ZQHFG-DQGX1] 2000:deaf::1 went from Active to Deleted
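
Most of that is graceful-restart chatter; the lines that matter are the WARN and ERR! ones. If the output has been saved to a file, a small filter along these lines pulls them out. This is a sketch that assumes the bracketed level tags look exactly as they do above; the function name bgpd_problems and the file name in the usage comment are made up.

```shell
#!/bin/sh
# Keep only bgpd lines logged at WARN or ERR! level.
# Reads log text on stdin, writes matching lines to stdout.
# Assumes the "[WARN] bgpd:" / "[ERR!] bgpd:" tags shown in the output above.
bgpd_problems() {
    grep -E '\[(WARN|ERR!)\] bgpd:'
}

# Usage (saved-debug.txt is a hypothetical capture of the terminal output):
#   bgpd_problems < saved-debug.txt
```

Run against the output above, that leaves three lines: the missing v6 LL address warning, the nexthop_set failure, and the bgp_getsockname() failure.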

The problem is fixed by turning the link-local address for IPv6 back on (it is on by default anyway). In netplan, it’s this part:

ethernets:
  eth0:
    link-local: [ ipv6 ]

And then it works.
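
To check that the address really came back before blaming FRR again, something like the following can help. It is a minimal sketch that reads the kernel's /proc/net/if_inet6 table, where the fourth column is the address scope in hex and 20 means link-local. The function name has_v6_link_local is made up here, and eth0 is simply the interface from the example above.

```shell
#!/bin/sh
# Return 0 if interface $1 has an IPv6 link-local address (scope 0x20).
# $2 is the table to read; it defaults to the kernel's /proc/net/if_inet6.
# has_v6_link_local is a hypothetical helper name, not a standard tool.
has_v6_link_local() {
    iface="$1"
    table="${2:-/proc/net/if_inet6}"
    # Columns: address, ifindex, prefix length, scope, flags, interface name.
    awk -v ifname="$iface" \
        '$6 == ifname && $4 == "20" { found = 1 } END { exit !found }' "$table"
}

if has_v6_link_local eth0; then
    echo "eth0 has a link-local address"
else
    echo "eth0 has no link-local address; expect FRR to reset IPv6 BGP sessions"
fi
```

The optional second argument exists only so the function can be pointed at a saved copy of the table instead of the live one.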
