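Network Topology and Description
Two S6850-56HF devices are directly connected and establish iBGP peering, with BFD used for failure detection.
Alarm Information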
%Jul 18 17:21:36:403 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.0/100.127.41.1, LD/RD:130/130, Interface:WGE1/0/55, SessType:Ctrl, LinkType:INET], Ver:1, Sta: INIT->UP, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:399 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.2/100.127.41.3, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: INIT->UP, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:399 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.0/100.127.41.1, LD/RD:130/130, Interface:WGE1/0/55, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->INIT, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:399 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.2/100.127.41.3, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->INIT, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:398 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.1 state has changed from OPENCONFIRM to ESTABLISHED.
%Jul 18 17:21:36:397 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.3 state has changed from OPENCONFIRM to ESTABLISHED.
%Jul 18 17:21:34:940 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: 100.127.41.1 state has changed from ESTABLISHED to IDLE. (Reason: a notification was received from the peer, error code: Receive Notificationcode 6/0, local interface: )
%Jul 18 17:21:34:940 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.1 state has changed from ESTABLISHED to IDLE for a notification received: Cease/ErrSubCode Unspecified.
%Jul 18 17:21:34:938 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.2/100.127.41.3, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
%Jul 18 17:21:34:937 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: 100.127.41.3 state has changed from ESTABLISHED to IDLE. (Reason: a notification was received from the peer, error code: Receive Notificationcode 6/0, local interface: )
%Jul 18 17:21:34:937 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.3 state has changed from ESTABLISHED to IDLE for a notification received: Cease/ErrSubCode Unspecified.
%Jul 18 17:21:34:936 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.0/100.127.41.1, LD/RD:130/130, Interface:WGE1/0/55, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
%Jul 18 17:21:36:400 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.1/100.127.41.0, LD/RD:130/130, Interface:WGE1/0/55, SessType:Ctrl, LinkType:INET], Ver:1, Sta: INIT->UP, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:399 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.3/100.127.41.2, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: INIT->UP, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:399 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.1/100.127.41.0, LD/RD:130/130, Interface:WGE1/0/55, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->INIT, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:398 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.3/100.127.41.2, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: DOWN->INIT, Diag: 0 (No Diagnostic)
%Jul 18 17:21:36:397 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.0 state has changed from OPENCONFIRM to ESTABLISHED.
%Jul 18 17:21:36:396 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.2 state has changed from OPENCONFIRM to ESTABLISHED.
%Jul 18 17:21:34:111 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: 100.127.41.0 state has changed from ESTABLISHED to IDLE. (Reason: a session down event was received from BFD, error code: Send Notificationcode 6/0, local interface: )
%Jul 18 17:21:34:111 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.0 state has changed from ESTABLISHED to IDLE for session down event received from BFD.
%Jul 18 17:21:34:109 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.1/100.127.41.0, LD/RD:130/130, Interface:WGE1/0/55, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
%Jul 18 17:21:34:005 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED_REASON: BGP.: 100.127.41.2 state has changed from ESTABLISHED to IDLE. (Reason: a session down event was received from BFD, error code: Send Notificationcode 6/0, local interface: )
%Jul 18 17:21:34:005 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.2 state has changed from ESTABLISHED to IDLE for session down event received from BFD.
%Jul 18 17:21:34:003 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.3/100.127.41.2, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)
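The ordering of BFD and BGP events in logs like these can be checked with a small script. The following is a minimal sketch, not an H3C tool: the regular expressions are assumptions derived only from the log lines above, and the names `parse_line`, `down_delays_ms`, and `SAMPLE` are hypothetical. It reports, per device and peer, how many milliseconds passed between the BFD UP->DOWN log and the first BGP ESTABLISHED-to-IDLE log.

```python
import re
from datetime import datetime, timedelta
from typing import NamedTuple, Optional

MONTHS = {m: i for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}

# Assumed header layout: %Mon DD HH:MM:SS:mmm YYYY host MODULE/sev/MNEMONIC: text
LINE_RE = re.compile(
    r"^%(?P<mon>[A-Z][a-z]{2}) +(?P<day>\d{1,2}) "
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2}):(?P<ms>\d{3}) (?P<year>\d{4}) "
    r"(?P<host>\S+) (?P<module>[A-Z]+)/\d/(?P<mnemonic>\w+): (?P<text>.*)$")
# BFD text: Sess[local/remote, ...], ..., Sta: OLD->NEW
BFD_RE = re.compile(
    r"Sess\[[\d.]+/(?P<peer>[\d.]+),.*Sta: (?P<old>\w+)->(?P<new>\w+)")
# BGP text: BGP.: <peer> state has changed from OLD to NEW
BGP_RE = re.compile(
    r"BGP\.: (?P<peer>[\d.]+) state has changed from (?P<old>\w+) to (?P<new>\w+)")


class Event(NamedTuple):
    when: datetime
    host: str
    module: str
    peer: str
    old: str
    new: str


def parse_line(line: str) -> Optional[Event]:
    """Return an Event for a BFD/BGP state-change log line, else None."""
    head = LINE_RE.match(line.strip())
    if head is None:
        return None
    body_re = {"BFD": BFD_RE, "BGP": BGP_RE}.get(head["module"])
    body = body_re.search(head["text"]) if body_re else None
    if body is None:
        return None
    when = datetime(int(head["year"]), MONTHS[head["mon"]], int(head["day"]),
                    int(head["h"]), int(head["m"]), int(head["s"]),
                    int(head["ms"]) * 1000)
    return Event(when, head["host"], head["module"], body["peer"],
                 body["old"].upper(), body["new"].upper())


def down_delays_ms(events):
    """Map (host, peer) to ms from BFD UP->DOWN to the first BGP ESTABLISHED->IDLE."""
    bfd, bgp = {}, {}
    for e in sorted(events, key=lambda ev: ev.when):
        key = (e.host, e.peer)
        if e.module == "BFD" and (e.old, e.new) == ("UP", "DOWN"):
            bfd.setdefault(key, e.when)  # keep the earliest occurrence
        elif e.module == "BGP" and (e.old, e.new) == ("ESTABLISHED", "IDLE"):
            bgp.setdefault(key, e.when)
    one_ms = timedelta(milliseconds=1)
    return {k: (bgp[k] - bfd[k]) // one_ms for k in bfd if k in bgp}


# Four lines copied from the alarm information above.
SAMPLE = [
    "%Jul 18 17:21:34:003 2023 SH-CY-A504-F04-HS6850-IL1-02 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.3/100.127.41.2, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)",
    "%Jul 18 17:21:34:005 2023 SH-CY-A504-F04-HS6850-IL1-02 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.2 state has changed from ESTABLISHED to IDLE for session down event received from BFD.",
    "%Jul 18 17:21:34:938 2023 SH-CY-A504-F05-HS6850-IL1-01 BFD/5/BFD_CHANGE_FSM: Sess[100.127.41.2/100.127.41.3, LD/RD:129/129, Interface:WGE1/0/56, SessType:Ctrl, LinkType:INET], Ver:1, Sta: UP->DOWN, Diag: 1 (Control Detection Time Expired)",
    "%Jul 18 17:21:34:937 2023 SH-CY-A504-F05-HS6850-IL1-01 BGP/5/BGP_STATE_CHANGED: BGP.: 100.127.41.3 state has changed from ESTABLISHED to IDLE for a notification received: Cease/ErrSubCode Unspecified.",
]

if __name__ == "__main__":
    events = [e for e in map(parse_line, SAMPLE) if e is not None]
    for (host, peer), delay in down_delays_ms(events).items():
        print(f"{host} peer {peer}: BGP down {delay} ms after BFD down")
```

On the four sample lines, device IL1-02 logs BGP down 2 ms after its own BFD timeout, while device IL1-01 shows -1 ms. A negative value means the BGP log precedes the local BFD log, which is consistent with IL1-01 reporting that BGP went down because a notification was received from the peer.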
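Problem Description
The BFD sessions went down and, through the BFD linkage, the BGP sessions went down at the same time.
Analysis
Inspection of the devices found no link-related faults.
Analysis of the diagnostic files showed that both devices run release 6555P01.
In release 6635 and earlier, BFD packets are processed by the CPU. If packets of other CPU-processed protocols suddenly increase and CPU usage spikes, BFD may flap and bring BGP down with it.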
===============display version===============
H3C Comware Software, Version 7.1.070, Release 6555P01
Copyright (c) 2004-2019 New H3C Technologies Co., Ltd. All rights reserved.
H3C S6850-56HF uptime is 103 weeks, 1 day, 0 hours, 41 minutes
Last reboot reason : Cold reboot
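The release check described above can be scripted when many devices need to be audited. This is a minimal sketch under two assumptions: the release appears in `display version` output as `Release <digits>` with an optional patch suffix such as `P01`, as in the output above, and release numbers can be compared numerically. The helper names `release_number` and `is_affected` are hypothetical.

```python
import re
from typing import Optional

# From the analysis: release 6635 and earlier process BFD packets on the CPU.
LAST_AFFECTED_RELEASE = 6635


def release_number(display_version: str) -> Optional[int]:
    """Extract the numeric release (e.g. 6555 from 'Release 6555P01')."""
    match = re.search(r"\bRelease (\d+)(?:P\d+)?", display_version)
    return int(match.group(1)) if match else None


def is_affected(display_version: str) -> bool:
    """True if the release is at or below the last affected release.

    Assumes plain numeric ordering of release numbers.
    """
    rel = release_number(display_version)
    return rel is not None and rel <= LAST_AFFECTED_RELEASE


if __name__ == "__main__":
    output = "H3C Comware Software, Version 7.1.070, Release 6555P01"
    print(release_number(output), is_affected(output))
```

For the output shown above, the extracted release is 6555, which falls in the affected range.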
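Solution
We recommend that the customer upgrade to release 6710 with the latest patch, which resolves the issue.
The customer mentioned configuring PTP-BFD or INT-BFD as a workaround, but using hardware BFD on the current release is not recommended for now, because the related functionality still needs further development.
The fixed release binds a dedicated CPU core to the BFD process, so BFD no longer flaps because of other processes.
Notes
A BFD flap brings the BGP session down through the BFD linkage.
Observe the upgrade precautions when upgrading to the fixed release.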