Bug 12821 - Regression: sensors: NCT6791D
Status: CLOSED FIXED
Alias: None
Product: IPFire
Classification: Unclassified
Component: ---
Version: 2
Hardware: x86_64 Linux
Importance: - Unknown - - Unknown -
Assignee: Arne.F
Reported: 2022-03-28 14:27 UTC by Manfred Knick
Modified: 2022-03-28 16:01 UTC

Description Manfred Knick 2022-03-28 14:27:41 UTC
REGRESSION:
This is to extract info from
https://community.ipfire.org/t/cu163-backup-install-restore-cu165-fresh-install-sensors-nct6775-broken/7468

https://fireinfo.ipfire.org/profile/ebc1539924251781ecb8481a7c9d2691eae5258f

The board provides the chip 'Nuvoton NCT6791D Super IO Sensors',
which requires the nct6775 module.

This has been working perfectly since April 2021 / Core Update 157:
https://git.ipfire.org/?p=ipfire-2.x.git;a=commit;h=c85b97ed5a76eb670ffb79d1377ad642f3ca32ec

A)
The "Old install / 2 GB limit" required a fresh install.
After restore, neither “HW Fan” nor “HW Volt” Graph displayed data anymore.

Bewilderingly, "Hardware Graphs" -> "mbmongraph settings" still displayed the familiar entries; enabling them had no effect.

B)
Fresh clean install of current Core Update 165 from
https://nightly.ipfire.org/core165/2022-03-23%2017:15:18%20+0000-c55f5c8e/x86_64/
NO import of any backup
Manual fresh config of the whole system

Startup log (console) displays "coretemp nct6775"

# grep -i nct6775 /var/log/messages

Mar 27 17:09:32 scrat kernel: nct6775: Enabling hardware monitor logical device mappings.
Mar 27 17:09:32 scrat kernel: nct6775: Found NCT6791D or compatible chip at 0x2e:0x290
Mar 27 22:00:32 scrat kernel: nct6775: Enabling hardware monitor logical device mappings.
Mar 27 22:00:32 scrat kernel: nct6775: Found NCT6791D or compatible chip at 0x2e:0x290
Mar 27 22:04:10 scrat kernel: nct6775: Enabling hardware monitor logical device mappings.
Mar 27 22:04:10 scrat kernel: nct6775: Found NCT6791D or compatible chip at 0x2e:0x290

Running "sensors-detect" manually (without saving the results):

# sensors-detect version 3.6.0
...
Driver `nct6775':
  * ISA bus, address 0x290
    Chip `Nuvoton NCT6791D Super IO Sensors' (confidence: 9)
...


NO entries are available to enable on "Hardware Graphs" at all.
Comment 1 Manfred Knick 2022-03-28 14:42:27 UTC
# lsmod | grep -i nct
nct6775                73728  0
hwmon_vid              16384  1 nct6775


# ps aux | grep -i nct
root      9378  0.0  0.0   6504  2356 pts/1    S+   16:41   0:00 grep -i nct


# ps aux | grep -i hwmon
root      9390  0.0  0.0   6504  2168 pts/1    S+   16:41   0:00 grep -i hwmon
Comment 2 Manfred Knick 2022-03-28 14:43:13 UTC
# sensors

coretemp-isa-0000
Adapter: ISA adapter
Package id 0:  +34.0°C  (high = +79.0°C, crit = +85.0°C)
Core 0:        +34.0°C  (high = +79.0°C, crit = +85.0°C)
Core 1:        +31.0°C  (high = +79.0°C, crit = +85.0°C)

acpitz-acpi-0
Adapter: ACPI interface
temp1:        +27.8°C  (crit = +90.0°C)
temp2:        +29.8°C  (crit = +90.0°C)

iwlwifi_1-virtual-0
Adapter: Virtual device
temp1:        +40.0°C
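
The `sensors` output above lists no nct6775-driven chip even though the module is loaded. A minimal sketch for checking the same thing directly in sysfs follows. The function names are hypothetical, and the chip name `nct6791` is an assumption inferred from the "nct6791isa0290*" entries mentioned in Comment 5.

```shell
#!/bin/sh
# Print the chip names registered under a hwmon class directory.
# The directory defaults to /sys/class/hwmon; the optional argument exists
# only so the function can be pointed at a test tree.
list_hwmon_names() {
    dir="${1:-/sys/class/hwmon}"
    for f in "$dir"/hwmon*/name; do
        [ -r "$f" ] && cat "$f"
    done
    return 0
}

# Succeed if a chip whose name starts with the given prefix is registered.
# $1 = name prefix (assumed "nct6791" for this board), $2 = optional directory
hwmon_has_chip() {
    list_hwmon_names "$2" | grep -q "^$1"
}

# Usage on the affected system:
#   hwmon_has_chip nct6791 && echo "chip registered" || echo "chip missing"
```

If the chip is missing here while `lsmod` shows nct6775 loaded, the driver found the chip but did not register a hwmon device, which matches the symptoms in this report.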
Comment 3 Manfred Knick 2022-03-28 14:46:58 UTC
Importance:
Especially important for checking remotely that all fans are still working correctly.
Comment 4 Manfred Knick 2022-03-28 14:52:45 UTC
Checking for conflict:

# rmmod nct6775

# modprobe nct6775

# dmesg
...
[   30.148393] nct6775: Enabling hardware monitor logical device mappings.
[   30.148404] nct6775: Found NCT6791D or compatible chip at 0x2e:0x290
[   30.148408] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\_GPE.HWM) (20210730/utaddress-204)
[   30.148414] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[64750.821678] i2c_dev: i2c /dev entries driver
[67583.354587] nct6775: Found NCT6791D or compatible chip at 0x2e:0x290
[67583.354594] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\_GPE.HWM) (20210730/utaddress-204)
[67583.354599] ACPI: OSL: Resource conflict; ACPI support missing from driver?
[67591.535199] nct6775: Found NCT6791D or compatible chip at 0x2e:0x290
[67591.535206] ACPI Warning: SystemIO range 0x0000000000000295-0x0000000000000296 conflicts with OpRegion 0x0000000000000290-0x0000000000000299 (\_GPE.HWM) (20210730/utaddress-204)
[67591.535211] ACPI: OSL: Resource conflict; ACPI support missing from driver?
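
The conflict shown above can be detected mechanically. This is a small sketch that scans kernel log text for the ACPI SystemIO warning; the function name is hypothetical, and the pattern is taken only from the log lines quoted in this comment.

```shell
#!/bin/sh
# Read kernel log text on stdin and succeed if it contains an ACPI
# SystemIO resource conflict warning like the one quoted above.
has_acpi_io_conflict() {
    grep -q 'ACPI Warning: SystemIO range .* conflicts with OpRegion'
}

# Usage on the affected system:
#   dmesg | has_acpi_io_conflict && echo "ACPI resource conflict present"
```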
Comment 5 Manfred Knick 2022-03-28 15:29:03 UTC
Just remembered a WORKAROUND from "those old days":
	
. . . acpi_enforce_resources=   [ACPI]
. . .          { strict | lax | no }

Adding "acpi_enforce_resources=lax"
at the end of both /boot/grub/grub.cfg entries:
. . . linux /vmlinuz ...

After reboot: "nct6791isa0290*" entries in mbmongraph settings.
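
To confirm after the reboot that the kernel actually received the parameter, the running command line can be inspected. A minimal sketch, with a hypothetical function name; it reads /proc/cmdline unless a command line is passed as an argument.

```shell
#!/bin/sh
# Print the value of acpi_enforce_resources from a kernel command line,
# or print nothing if the parameter is absent.
# $1 = optional command line string; defaults to the running kernel's.
acpi_enforce_value() {
    cmdline="${1:-$(cat /proc/cmdline)}"
    for word in $cmdline; do
        case "$word" in
            acpi_enforce_resources=*)
                echo "${word#acpi_enforce_resources=}"
                ;;
        esac
    done
}

# Usage on the affected system:
#   acpi_enforce_value    # expected to print "lax" once the workaround is active
```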


"Dirty" -> "A little bit cleaner":

? Where does IPFire store additional linux command-line parameters?
? As usual, in /etc/grub.d/40_custom?

? update /boot/grub? via grub-mkconfig?

? Will that survive the next update / upgrade?
Presumably not:
. . . # grep -i grub /var/ipfire/backup/include
. . . #
Comment 6 Manfred Knick 2022-03-28 15:34:58 UTC
Just discovered:

# cat /etc/default/grub

GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_CMDLINE_LINUX="panic=10"                    <-----
GRUB_DISABLE_RECOVERY="true"
GRUB_BACKGROUND="/boot/grub/splash.png"
GRUB_GFXMODE="none"


Will it be sufficient to add "acpi_enforce_resources=lax" here?

Will it survive?
Or will it be necessary to re-edit after each update?
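
For reference, a sketch of how the parameter could be added to GRUB_CMDLINE_LINUX without editing by hand. The function name is hypothetical, GNU sed is assumed for `sed -i`, and the parameter is assumed to contain no regex special characters. With stock GRUB 2 the change only takes effect after regenerating the configuration (usually `grub-mkconfig -o /boot/grub/grub.cfg`); whether IPFire updates preserve the edit is not answered in this thread.

```shell
#!/bin/sh
# Append a kernel parameter to GRUB_CMDLINE_LINUX in the given file,
# unless it is already present. A .bak copy is written before editing.
# $1 = path to the defaults file (e.g. /etc/default/grub)
# $2 = parameter to add (e.g. acpi_enforce_resources=lax)
add_grub_cmdline_param() {
    file="$1"
    param="$2"
    # Nothing to do if the parameter is already on the line.
    if grep -q "^GRUB_CMDLINE_LINUX=\".*$param" "$file"; then
        return 0
    fi
    cp "$file" "$file.bak"
    # Re-emit the quoted value with the parameter appended.
    sed -i "s|^GRUB_CMDLINE_LINUX=\"\(.*\)\"|GRUB_CMDLINE_LINUX=\"\1 $param\"|" "$file"
}

# Usage (as root, on the affected system):
#   add_grub_cmdline_param /etc/default/grub acpi_enforce_resources=lax
```

Calling the function twice leaves the file unchanged the second time, so it is safe to re-run after an update if the line was reset.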
Comment 7 Michael Tremer 2022-03-28 15:36:35 UTC
Manfred, you really don't need to document your debugging sessions in detail here.

There is no value in flooding everyone's inboxes with lots of messages like this.

I take it that you have some hardware that simply has a bug. Good that you found a workaround, but I don't think that there is anything here for us to do.
Comment 8 Manfred Knick 2022-03-28 15:58:51 UTC
(In reply to Michael Tremer from comment #7)

> ... but I don't think that there is anything here for us to
> do.

It worked for years.
re-install cu157 / backup restore  broke it.        <-----
Reported on forum.
Done the bug work.

?   And you don't care?
?   Even feel bothered by results?
?   Great.


Let me state this very clearly:
. . . I like IPFire,
. . . supported and advocated it.

But an answer of honest concern for IPFire:
This is just one of the elements
indicating that IPFire - for some months now - degrades in reliability:
->   backup and restore - not being tested?
->   $Path problems
->   ... et al.

Would be nice if someone could answer the minor questions left above.
Thanks.
Comment 9 Michael Tremer 2022-03-28 16:01:45 UTC
(In reply to Manfred Knick from comment #8)
> It worked for years.
> re-install cu157 / backup restore  broke it.        <-----
> Reported on forum.
> Done the bug work.
> 
> ?   And you don't care?

No, because there is no bug reported here. If you think the backup restore doesn't work okay here, then please report it as such.

This is a *bug* *tracker*. It *tracks* *bugs*.

You are looking for a conversation with someone and this isn't the place for this.

> But an answer of honest concern for IPFire:
> This is just one of the elements
> indicating that IPFire - for some months now - degrades in reliability:
> ->   backup and restore - not being tested?
> ->   $Path problems
> ->   ... et al.

Again, this is not the place for this.