
Doesn't work on Ubuntu 20.04.1 #12

Open
thedanyes opened this issue Oct 9, 2020 · 9 comments

Comments

@thedanyes

Seems like some kind of problem with the GPU portion? I have Ubuntu 20.04.1 with NVIDIA 450.66, Python 3.8.2.

~/Downloads/sysmon/src$ python3 sysmon.py 

Traceback (most recent call last):
  File "sysmon.py", line 595, in <module>
    main()
  File "sysmon.py", line 589, in main
    main = MainWindow()
  File "sysmon.py", line 217, in __init__
    self.update_gpuinfo()
  File "sysmon.py", line 526, in update_gpuinfo
    num = data[gpu_ind][4]
IndexError: list index out of range

GPU hardware is GTX 780.

Here's a screenshot of nvidia-smi output if that helps:

~/Downloads/sysmon/src$ nvidia-smi
Thu Oct  8 20:29:02 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.66       Driver Version: 450.66       CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce GTX 780     Off  | 00000000:01:00.0 N/A |                  N/A |
| 17%   37C    P8    N/A /  N/A |    795MiB /  3018MiB |     N/A      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

@MatthiasSchinzel
Owner

MatthiasSchinzel commented Oct 9, 2020

Not sure yet where the problem is. Could you please post the output of nvidia-smi dmon -c 1?

@thedanyes
Author

~$ nvidia-smi dmon -c 1
Not supported on the device(s)
Failed to process command line 

@MatthiasSchinzel
Owner

MatthiasSchinzel commented Oct 10, 2020

So here we have the problem. I don't know why nvidia-smi fails here; maybe NVIDIA does not support this functionality for this model.
However, there is still hope we can get that information from a different command. Please post the output of nvidia-smi -q. If that does not work, I am afraid we will have to disable your GPU statistics in the tool. :/

Btw, I think this problem does not depend on Ubuntu 20.04.1.

@aragubas

I am having the same issue, so I will post my output here since I can't run it either.

https://pastebin.com/raw/AVh2jWdv

@MatthiasSchinzel
Owner

MatthiasSchinzel commented Oct 10, 2020

Utilization
    Gpu                         : N/A
    Memory                      : N/A
    Encoder                     : N/A
    Decoder                     : N/A

The only thing we can get in your case is the memory load :(

FB Memory Usage
    Total                       : 962 MiB
    Used                        : 455 MiB
    Free                        : 507 MiB

Maybe better than nothing, but not really satisfying in my opinion. And for the GPU load itself we still have to go through nvidia-smi.

However, I am still interested in brown-d's output, to see if nvidia-smi behaves the same in his case.
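Parsing those FB Memory Usage fields out of the nvidia-smi -q text could look roughly like this. This is a sketch against the output pasted above; parse_fb_memory is a hypothetical helper name, not sysmon's actual parser:

```python
import re

def parse_fb_memory(q_output):
    """Extract (total, used, free) in MiB from `nvidia-smi -q` text.

    Returns None when the FB Memory Usage section is missing, so a
    caller could fall back to disabling the GPU memory display.
    """
    match = re.search(
        r"FB Memory Usage\s*\n"
        r"\s*Total\s*:\s*(\d+) MiB\s*\n"
        r"\s*Used\s*:\s*(\d+) MiB\s*\n"
        r"\s*Free\s*:\s*(\d+) MiB",
        q_output,
    )
    if match is None:
        return None
    return tuple(int(x) for x in match.groups())
```

On the output quoted above this would yield (962, 455, 507), while a card reporting only N/A utilization fields would simply return None for any missing section.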

@thedanyes
Author

~$ nvidia-smi -q

==============NVSMI LOG==============

Timestamp                                 : Sat Oct 10 17:02:53 2020
Driver Version                            : 450.66
CUDA Version                              : 11.0

Attached GPUs                             : 1
GPU 00000000:01:00.0
    Product Name                          : GeForce GTX 780
    Product Brand                         : GeForce
    Display Mode                          : N/A
    Display Active                        : N/A
    Persistence Mode                      : Disabled
    MIG Mode
        Current                           : N/A
        Pending                           : N/A
    Accounting Mode                       : N/A
    Accounting Mode Buffer Size           : N/A
    Driver Model
        Current                           : N/A
        Pending                           : N/A
    Serial Number                         : N/A
    GPU UUID                              : GPU-a230e637-c9e1-1b61-3830-a437b874d88d
    Minor Number                          : 0
    VBIOS Version                         : 80.80.31.00.0D
    MultiGPU Board                        : N/A
    Board ID                              : N/A
    GPU Part Number                       : N/A
    Inforom Version
        Image Version                     : N/A
        OEM Object                        : N/A
        ECC Object                        : N/A
        Power Management Object           : N/A
    GPU Operation Mode
        Current                           : N/A
        Pending                           : N/A
    GPU Virtualization Mode
        Virtualization Mode               : N/A
        Host VGPU Mode                    : N/A
    IBMNPU
        Relaxed Ordering Mode             : N/A
    PCI
        Bus                               : 0x01
        Device                            : 0x00
        Domain                            : 0x0000
        Device Id                         : 0x100410DE
        Bus Id                            : 00000000:01:00.0
        Sub System Id                     : 0x36041458
        GPU Link Info
            PCIe Generation
                Max                       : N/A
                Current                   : N/A
            Link Width
                Max                       : N/A
                Current                   : N/A
        Bridge Chip
            Type                          : N/A
            Firmware                      : N/A
        Replays Since Reset               : 0
        Replay Number Rollovers           : 0
        Tx Throughput                     : N/A
        Rx Throughput                     : N/A
    Fan Speed                             : 17 %
    Performance State                     : P8
    Clocks Throttle Reasons               : N/A
    FB Memory Usage
        Total                             : 3018 MiB
        Used                              : 930 MiB
        Free                              : 2088 MiB
    BAR1 Memory Usage
        Total                             : N/A
        Used                              : N/A
        Free                              : N/A
    Compute Mode                          : Default
    Utilization
        Gpu                               : N/A
        Memory                            : N/A
        Encoder                           : N/A
        Decoder                           : N/A
    Encoder Stats
        Active Sessions                   : N/A
        Average FPS                       : N/A
        Average Latency                   : N/A
    FBC Stats
        Active Sessions                   : N/A
        Average FPS                       : N/A
        Average Latency                   : N/A
    Ecc Mode
        Current                           : N/A
        Pending                           : N/A
    ECC Errors
        Volatile
            Single Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
            Double Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
        Aggregate
            Single Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
            Double Bit            
                Device Memory             : N/A
                Register File             : N/A
                L1 Cache                  : N/A
                L2 Cache                  : N/A
                Texture Memory            : N/A
                Texture Shared            : N/A
                CBU                       : N/A
                Total                     : N/A
    Retired Pages
        Single Bit ECC                    : N/A
        Double Bit ECC                    : N/A
        Pending Page Blacklist            : N/A
    Remapped Rows                         : N/A
    Temperature
        GPU Current Temp                  : 31 C
        GPU Shutdown Temp                 : N/A
        GPU Slowdown Temp                 : N/A
        GPU Max Operating Temp            : N/A
        Memory Current Temp               : N/A
        Memory Max Operating Temp         : N/A
    Power Readings
        Power Management                  : N/A
        Power Draw                        : N/A
        Power Limit                       : N/A
        Default Power Limit               : N/A
        Enforced Power Limit              : N/A
        Min Power Limit                   : N/A
        Max Power Limit                   : N/A
    Clocks
        Graphics                          : N/A
        SM                                : N/A
        Memory                            : N/A
        Video                             : N/A
    Applications Clocks
        Graphics                          : 954 MHz
        Memory                            : 3004 MHz
    Default Applications Clocks
        Graphics                          : 954 MHz
        Memory                            : 3004 MHz
    Max Clocks
        Graphics                          : N/A
        SM                                : N/A
        Memory                            : N/A
        Video                             : N/A
    Max Customer Boost Clocks
        Graphics                          : N/A
    Clock Policy
        Auto Boost                        : N/A
        Auto Boost Default                : N/A
    Processes                             : None

Also I found this in 'man nvidia-smi':

The "nvidia-smi dmon" command-line is used to monitor one or more GPUs (up to 4 devices) plugged into the system. [...] The output data per line is limited by the terminal size. It is supported on Tesla, GRID, Quadro and limited GeForce products for Kepler or newer GPUs.

@MikeyJzak

I am also having issues on Ubuntu 20.04.1 LTS x86_64 (no NVIDIA GPU), on a Lenovo X1 Carbon.


python3 sysmon.py
cat: /sys/class/net/wwan0/speed: Invalid argument
Traceback (most recent call last):
  File "sysmon.py", line 613, in <module>
    main()
  File "sysmon.py", line 607, in main
    main = MainWindow()
  File "sysmon.py", line 102, in __init__
    self.s = sysinfo()
  File "/home/mjzak/Downloads/sysmon/src/gather_data.py", line 59, in __init__
    self.get_max_connection_speed()
  File "/home/mjzak/Downloads/sysmon/src/gather_data.py", line 267, in get_max_connection_speed
    self.max_connection_speed.append(processes[-1])
IndexError: list index out of range

@nate-han

Are we any further with this issue? I encountered it as well.

@Mekk

Mekk commented Oct 14, 2021

I faced the same problem (GeForce GT 710 in my case; nvidia-smi dmon fails with "Not supported on the device(s)").

IMHO it doesn't make sense to classify supported and unsupported cards. Simple logic "if nvidia-smi fails, disable GPU stats" would do…

Quick&dirty patch in sysmon.py:

    def update_gpuinfo(self):
        self.gpuinfo = np.roll(self.gpuinfo, -1, axis=0)
        data = self.s.get_nvidia_smi_info()
        if len(data[0]) < 5:  # not enough fields for the data[gpu_ind][4] access below
            return

(I added the last two lines) and sysmon starts. Of course the GeForce tab doesn't show any data; it would be nicer to hide the tab in such a case.
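That "if nvidia-smi fails, disable GPU stats" logic could also be a one-time probe at startup. A minimal sketch, assuming a hypothetical dmon_supported helper (not part of sysmon) that the GUI could consult before creating the GPU tab:

```python
import shutil
import subprocess

def dmon_supported(cmd=("nvidia-smi", "dmon", "-c", "1")):
    """Return True only when `nvidia-smi dmon` runs successfully.

    On unsupported cards (GTX 780, GT 710, ...) nvidia-smi exits
    non-zero with "Not supported on the device(s)"; on machines with
    no NVIDIA driver the binary is missing entirely. Both cases
    return False, which the caller can use to hide the GPU tab.
    """
    if shutil.which(cmd[0]) is None:
        return False
    try:
        result = subprocess.run(cmd, capture_output=True, timeout=10)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0
```

Checking the exit code rather than parsing the error message keeps the probe locale- and version-independent; the GPU statistics then degrade gracefully instead of crashing in update_gpuinfo.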
