runtime: (minor) Sometimes "Type" in interfaces status (from eeprom) is not populated #45

Open
wdoekes opened this issue Nov 21, 2024 · 0 comments

wdoekes commented Nov 21, 2024

Description

root@spine2:0:~# show interface status Ethernet0-8

  Interface                    Lanes    Speed    MTU    FEC        Alias    Vlan    Oper    Admin    Type    Asym PFC
-----------  -----------------------  -------  -----  -----  -----------  ------  ------  -------  ------  ----------
  Ethernet0  73,74,75,76,77,78,79,80     400G   9100     rs  Eth1(Port1)  routed      up       up     N/A         N/A
  Ethernet8  65,66,67,68,69,70,71,72     400G   9100     rs  Eth2(Port2)  routed      up       up     N/A         N/A

vs. a case where Type is populated:

  Interface                            Lanes    Speed    MTU    FEC              Alias    Vlan    Oper    Admin                                             Type    Asym PFC
-----------  -------------------------------  -------  -----  -----  -----------------  ------  ------  -------  -----------------------------------------------  ----------
  Ethernet0          73,74,75,76,77,78,79,80     400G   9100     rs          Ethernet0  routed      up       up  QSFP-DD Double Density 8X Pluggable Transceiver         N/A
  Ethernet8          65,66,67,68,69,70,71,72     400G   9100     rs   fourHundredGigE2  routed      up       up  QSFP-DD Double Density 8X Pluggable Transceiver         N/A
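
To spot this condition automatically, the table above can be parsed for rows whose Type column is N/A. This is only a sketch: the function name `ports_missing_type` is made up here, and it assumes the input starts at the header line and that columns are separated by two or more spaces, as in the output shown.

```python
import re


def ports_missing_type(table_text):
    """Return interface names whose Type column reads N/A.

    Assumes columns are separated by 2+ spaces and no cell is empty
    (an empty cell would shift the columns in this simple parser).
    """
    header = None
    missing = []
    for line in table_text.splitlines():
        row = line.strip()
        if not row or row.startswith("-"):
            continue  # skip blank lines and the dashed separator
        cells = re.split(r"\s{2,}", row)
        if header is None:
            header = cells  # first non-empty line is the header
            continue
        record = dict(zip(header, cells))
        if record.get("Type") == "N/A":
            missing.append(record["Interface"])
    return missing
```

Feeding it the first table would list both Ethernet0 and Ethernet8; the second table would give an empty list.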

The values are available in the EEPROM, though:

root@spine2:0:~# hd /sys/bus/i2c/devices/25-0050/eeprom  | grep QDD
00000090  20 64 9d 99 51 44 44 2d  34 30 30 47 2d 50 43 30  | d..QDD-400G-PC0|
000000e0  31 31 30 35 34 33 46 53  51 44 44 2d 34 30 30 47  |110543FSQDD-400G|
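
The same check can be done without hd by scanning the raw EEPROM bytes for printable ASCII runs. A minimal sketch, assuming the sysfs path from the command above (bus and address differ per port); `find_ascii_strings` is a hypothetical helper, not part of any SONiC tooling.

```python
import re
from pathlib import Path

# Path taken from the hd command above; adjust bus/address per port.
EEPROM_PATH = Path("/sys/bus/i2c/devices/25-0050/eeprom")

# Runs of at least four printable ASCII bytes.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def find_ascii_strings(data, needle):
    """Return (offset, text) for each printable run that contains needle."""
    hits = []
    for match in PRINTABLE_RUN.finditer(data):
        if needle in match.group():
            hits.append((match.start(), match.group().decode("ascii")))
    return hits


if __name__ == "__main__":
    if EEPROM_PATH.exists():
        for offset, text in find_ascii_strings(EEPROM_PATH.read_bytes(), b"QDD"):
            print(f"{offset:#06x}  {text}")
```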

But the transceiver eeprom command does not agree:

root@spine2:0:~# show interfaces transceiver eeprom Ethernet0
Ethernet0: SFP EEPROM Not detected

Might be related to:

2024 Nov 21 13:23:37.946372 spine2 ERR pmon#xcvrd[29]: Xcvrd: exception found at child thread CmisManagerTask due to KeyError(None)
2024 Nov 21 13:23:37.946417 spine2 ERR pmon#xcvrd[29]: Exiting main loop as child thread raised exception!
2024 Nov 21 13:23:39.332537 spine2 ERR pmon#xcvrd: Exception occured at CmisManagerTask thread due to KeyError(None)
2024 Nov 21 13:23:39.334767 spine2 ERR pmon#xcvrd: Traceback (most recent call last):
2024 Nov 21 13:23:39.334807 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1509, in run
2024 Nov 21 13:23:39.334807 spine2 ERR pmon#xcvrd:     self.task_worker()
2024 Nov 21 13:23:39.334855 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 1167, in task_worker
2024 Nov 21 13:23:39.334855 spine2 ERR pmon#xcvrd:     port_change_observer.handle_port_update_event()
2024 Nov 21 13:23:39.334855 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd_utilities/port_event_helper.py", line 200, in handle_port_update_event
2024 Nov 21 13:23:39.334915 spine2 ERR pmon#xcvrd:     self.port_change_event_handler(port_change_event)
2024 Nov 21 13:23:39.334915 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 741, in on_port_update_event
2024 Nov 21 13:23:39.334915 spine2 ERR pmon#xcvrd:     self.force_cmis_reinit(lport, 0)
2024 Nov 21 13:23:39.334915 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 901, in force_cmis_reinit
2024 Nov 21 13:23:39.334989 spine2 ERR pmon#xcvrd:     self.update_port_transceiver_status_table_sw_cmis_state(lport, CMIS_STATE_INSERTED)
2024 Nov 21 13:23:39.334989 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd.py", line 683, in update_port_transceiver_status_table_sw_cmis_state
2024 Nov 21 13:23:39.334989 spine2 ERR pmon#xcvrd:     status_table = self.xcvr_table_helper.get_status_tbl(asic_index)
2024 Nov 21 13:23:39.334989 spine2 ERR pmon#xcvrd:                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024 Nov 21 13:23:39.335013 spine2 ERR pmon#xcvrd:   File "/usr/local/lib/python3.11/dist-packages/xcvrd/xcvrd_utilities/xcvr_table_helper.py", line 53, in get_status_tbl
2024 Nov 21 13:23:39.335013 spine2 ERR pmon#xcvrd:     return self.status_tbl[asic_id]
2024 Nov 21 13:23:39.335067 spine2 ERR pmon#xcvrd:            ~~~~~~~~~~~~~~~^^^^^^^^^
2024 Nov 21 13:23:39.335067 spine2 ERR pmon#xcvrd: KeyError: None
2024 Nov 21 13:23:39.335177 spine2 ERR pmon#xcvrd[54]: Xcvrd: exception found at child thread CmisManagerTask due to KeyError(None)
2024 Nov 21 13:23:39.335255 spine2 ERR pmon#xcvrd[54]: Exiting main loop as child thread raised exception!
2024 Nov 21 13:23:40.018515 spine2 ERR snmp#snmp-subagent [ax_interface] ERROR: MIBUpdater.start() caught an unexpected exception during update_data()
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/ax_interface/mib.py", line 48, in start
    self.update_data()
  File "/usr/local/lib/python3.11/dist-packages/sonic_ax_impl/mibs/vendor/cisco/ciscoSwitchQosMIB.py", line 105, in update_data
    namespace = self.port_index_namespace[int(port_index)]
                ~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^
KeyError: 251
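
The xcvrd traceback ends in `self.status_tbl[asic_id]` raising `KeyError: None`, so the `asic_index` passed to `get_status_tbl` must have been `None`. The sketch below reproduces that failure shape and shows one possible guard. It is not the real xcvrd code: `TableHelper`, `lookup_asic_index` and `safe_get_status_tbl` are hypothetical stand-ins, and the idea that an unknown port yields `None` is an assumption, not something the logs confirm.

```python
class TableHelper:
    """Hypothetical stand-in for the helper seen in the traceback."""

    def __init__(self, asic_ids):
        # One (empty) status table per known ASIC id.
        self.status_tbl = {asic_id: {} for asic_id in asic_ids}

    def get_status_tbl(self, asic_id):
        # Same shape as the failing line: a plain dict lookup.
        return self.status_tbl[asic_id]


def lookup_asic_index(port_to_asic, lport):
    # Assumption: a port missing from the mapping yields None.
    return port_to_asic.get(lport)


def safe_get_status_tbl(helper, port_to_asic, lport):
    """Return the status table, or None when the port has no ASIC index."""
    asic_index = lookup_asic_index(port_to_asic, lport)
    if asic_index is None:
        return None  # caller can log and skip instead of crashing the thread
    return helper.get_status_tbl(asic_index)
```

With a guard like this, a port without an ASIC index would be skipped rather than taking down the whole task; whether that is the right fix depends on why the index is missing in the first place.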

Could it be because this is a breakout switch?

Restarting pmon did not help; xcvrd kept getting killed right after being spawned:

2024-11-21 13:34:27,983 WARN exited: xcvrd (terminated by SIGKILL; not expected)
2024-11-21 13:34:29,007 INFO spawned: 'xcvrd' with pid 54
2024-11-21 13:34:29,374 WARN exited: xcvrd (terminated by SIGKILL; not expected)
2024-11-21 13:34:31,419 INFO spawned: 'xcvrd' with pid 62
2024-11-21 13:34:31,780 WARN exited: xcvrd (terminated by SIGKILL; not expected)
2024-11-21 13:34:34,812 INFO spawned: 'xcvrd' with pid 70
2024-11-21 13:34:35,166 WARN exited: xcvrd (terminated by SIGKILL; not expected)
2024-11-21 13:34:35,166 INFO gave up: xcvrd entered FATAL state, too many start retries too quickly
2024-11-21 13:34:38,239 INFO success: psud entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
2024-11-21 13:34:38,239 INFO success: syseepromd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
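
The timestamps above show how short-lived each xcvrd attempt was. A small sketch that pairs each "spawned" line with the next "exited" line for the same program and computes the lifetime in seconds; the function name `lifetimes` is made up, and the regex only targets the log format shown here.

```python
import re
from datetime import datetime

EVENT_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) \w+ (spawned|exited): '?(\w+)'?"
)


def lifetimes(log_lines, program):
    """Return seconds between each spawn of program and its following exit."""
    result = []
    spawned_at = None
    for line in log_lines:
        match = EVENT_RE.match(line)
        if not match or match.group(3) != program:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        if match.group(2) == "spawned":
            spawned_at = stamp
        elif spawned_at is not None:
            # An exit with no spawn in the excerpt is ignored.
            result.append(round((stamp - spawned_at).total_seconds(), 3))
            spawned_at = None
    return result
```

On the excerpt above each attempt lasts well under half a second, far short of the 10 seconds (startsecs) that the other daemons needed to be counted as RUNNING.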

Restarting pmon more carefully (stop, wait a bit, then start) did help in the end:

root@spine2:130:~# pmon.sh stop

(waaait a bit)

root@spine2:0:~# pmon.sh start
Starting existing pmon container with HWSKU Accton-AS9716-32D

root@spine2:0:~# docker logs pmon -f
...
2024-11-21 13:37:23,996 INFO success: xcvrd entered RUNNING state, process has stayed up for > than 10 seconds (startsecs)
...

root@spine2:0:~# show interface transceiver eeprom Ethernet0
Ethernet0: SFP EEPROM detected
        Active Firmware: N/A
        Active application selected code assigned to host lane 1: N/A
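
The stop/wait/start sequence that worked can be scripted. A minimal sketch: the `pmon.sh stop` and `pmon.sh start` commands are the ones used above, but the 30 second default wait is an assumption (the issue only says "a bit"), and `restart_pmon` is a hypothetical helper.

```python
import subprocess
import time


def restart_pmon(wait_seconds=30.0, run=subprocess.run, sleep=time.sleep):
    """Stop pmon, pause, then start it again.

    wait_seconds is a guess; tune it to whatever lets the container settle.
    run and sleep are injectable so the sequence can be dry-run in tests.
    """
    run(["pmon.sh", "stop"], check=True)
    sleep(wait_seconds)
    run(["pmon.sh", "start"], check=True)
```

Afterwards, `docker logs pmon -f` should show xcvrd entering the RUNNING state, as in the output above.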

Which build are we running (if any)

SONiC Software Version: SONiC.osso202405.0-439acd33c
SONiC OS Version: 12
Distribution: Debian 12.8
Kernel: 6.1.0-22-2-amd64
Build commit: 439acd33c
Build date: Wed Nov 20 22:41:18 UTC 2024
Built by: [email protected]

Platform: x86_64-accton_as9716_32d-r0
HwSKU: Accton-AS9716-32D
ASIC: broadcom
ASIC Count: 1

Upstream issues/PRs

wdoekes changed the title on Nov 21, 2024:
  from: runtime: Sometimes "Type" in interfaces status (from eeprom) is not populated
  to: runtime: (minor) Sometimes "Type" in interfaces status (from eeprom) is not populated