Bug 13155 - _fireinfo.detect_hypervisor() rises Segmentation fault
Summary: _fireinfo.detect_hypervisor() rises Segmentation fault
Status: CLOSED FIXED
Alias: None
Product: Fireinfo
Classification: Unclassified
Component: Client (show other bugs)
Version: unspecified
Hardware: x86_64 Linux
: Will affect almost no one Crash
Assignee: Michael Tremer
QA Contact:
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-21 20:37 UTC by Mauro Condarelli
Modified: 2023-08-08 12:14 UTC (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Mauro Condarelli 2023-06-21 20:37:10 UTC
I am testing IPFire in a virtual environment, specifically as a LXD Virtual machine.
This uses QEmu with `accel="kvm"` under the hood.

In this condition `sendprofile` crashes reliably and I find in `/var/log/messages` something like:
```
Jun 21 13:03:05 ipfire kernel: sendprofile[15138]: segfault at 0 ip 000071470f8c465e sp 00007fff7e948ce8 error 4 in libc.so.6[71470f840000+159000] likely on CPU 0 (core 0, socket 0)
Jun 21 13:03:05 ipfire kernel: Code: b6 07 29 c8 c3 0f 1f 80 00 00 00 00 f3 0f 1e fa 89 f8 31 d2 66 0f ef ff 09 f0 25 ff 0f 00 00 3d c0 0f 00 00 0f 8f 74 02 00 00 <f3> 0f 6f 0f f3 0f 6f 06 66 0f 74 c1 66 0f da c1 66 0f ef c9 66 0f
```

Further analysis shows Segfault happens in `/usr/lib/python3.10/site-packages/fireinfo/hypervisor.py` line 26:
```
  			self.__hypervisor = _fireinfo.detect_hypervisor()
```
This is a `C` compiled extension.
I slightly modified the file making it a standalone `C` program (minimal changes) so I could add some printouts; result is:
```c
#include <errno.h>
#include <fcntl.h>
#include <linux/hdreg.h>
#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>
#include <stdio.h>
#include <stdint.h>
#include <assert.h>

/* hypervisor vendors */
enum hypervisors {
	HYPER_NONE       = 0,
	HYPER_XEN,
	HYPER_KVM,
	HYPER_MSHV,
	HYPER_VMWARE,
	HYPER_OTHER,
	HYPER_LAST /* for loop - must be last*/
};

const char *hypervisor_ids[] = {
	[HYPER_NONE]    = NULL,
	[HYPER_XEN]     = "XenVMMXenVMM",
	[HYPER_KVM]     = "KVMKVMKVM",
	/* http://msdn.microsoft.com/en-us/library/ff542428.aspx */
	[HYPER_MSHV]    = "Microsoft Hv",
	/* http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458 */
	[HYPER_VMWARE]  = "VMwareVMware",
	[HYPER_OTHER]   = NULL
};

const char *hypervisor_vendors[] = {
	[HYPER_NONE]    = NULL,
	[HYPER_XEN]     = "Xen",
	[HYPER_KVM]     = "KVM",
	[HYPER_MSHV]    = "Microsoft",
	[HYPER_VMWARE]  = "VMWare",
	[HYPER_OTHER]   = "other"
};

#define NEWLINE "\n\r"

static void truncate_nl(char *s) {
	assert(s);

	s[strcspn(s, NEWLINE)] = '\0';
}

static int read_one_line_file(const char *filename, char **line) {
	char t[2048];

	if (!filename || !line)
		return -EINVAL;

	FILE* f = fopen(filename, "re");
	if (!f)
		return -errno;

	if (!fgets(t, sizeof(t), f)) {
		if (ferror(f))
			return errno ? -errno : -EIO;

		t[0] = 0;
	}

	char *c = strdup(t);
	if (!c)
		return -ENOMEM;

	truncate_nl(c);

	*line = c;
	return 0;
}

/*
 * This CPUID leaf returns the information about the hypervisor.
 * EAX : maximum input value for CPUID supported by the hypervisor.
 * EBX, ECX, EDX : Hypervisor vendor ID signature. E.g. VMwareVMware.
 */
#define HYPERVISOR_INFO_LEAF   0x40000000

int detect_hypervisor(int *hypervisor) {
#if defined(__x86_64__) || defined(__i386__)
	/* Try high-level hypervisor sysfs file first: */
	char *hvtype = NULL;
	int r = read_one_line_file("/sys/hypervisor/type", &hvtype);
	if (r >= 0) {
		if (strcmp(hvtype, "xen") == 0) {
			*hypervisor = HYPER_XEN;
			return 1;
		}
	} else if (r != -ENOENT)
		return r;

	/* http://lwn.net/Articles/301888/ */

#if defined(__amd64__)
#define REG_a "rax"
#define REG_b "rbx"
#elif defined(__i386__)
#define REG_a "eax"
#define REG_b "ebx"
#endif

	uint32_t eax = 1;
	uint32_t ecx;
	union {
		uint32_t sig32[3];
		char text[13];
	} sig = {};

	__asm__ __volatile__ (
		/* ebx/rbx is being used for PIC! */
		"  push %%"REG_b"       \n\t"
		"  cpuid                \n\t"
		"  pop %%"REG_b"        \n\t"

		: "=a" (eax), "=c" (ecx)
		: "0" (eax)
	);

	bool has_hypervisor = !!(ecx & 0x80000000U);

	if (has_hypervisor) {
		/* There is a hypervisor, see what it is... */
		eax = 0x40000000U;
		__asm__ __volatile__ (
			"  push %%"REG_b"       \n\t"
			"  cpuid                \n\t"
			"  mov %%ebx, %1        \n\t"
			"  pop %%"REG_b"        \n\t"

			: "=a" (eax), "=r" (sig.sig32[0]), "=c" (sig.sig32[1]), "=d" (sig.sig32[2])
			: "0" (eax)
		);
		sig.text[12] = '\0';

		*hypervisor = HYPER_OTHER;

		if (*sig.text) {
			printf("sig.text: '%s'\n", sig.text);
			for (int id = HYPER_NONE + 1; id < HYPER_LAST; id++) {
				printf("checking: '%s'\n", hypervisor_ids[id]);
				if (strcmp(hypervisor_ids[id], sig.text) == 0) {
					*hypervisor = id;
					break;
				}
			}
		}

		return 1;
	}
#endif
	return 0;
}

int
main() {
	/*
		Get hypervisor from the cpuid command.
	*/
	int hypervisor = HYPER_NONE;

	int r = detect_hypervisor(&hypervisor);
	if (r >= 1) {
		const char* hypervisor_vendor = hypervisor_vendors[hypervisor];
		if (!hypervisor_vendor)
			printf("No hypervisor found\n");
		else
			printf("Hypervisor '%s' found!", hypervisor_vendor);
	}

	return 0;
}
```
Program output is:
```
sig.text: 'Linux KVM Hv'
checking: 'XenVMMXenVMM'
checking: 'KVMKVMKVM'
checking: 'Microsoft Hv'
checking: 'VMwareVMware'
checking: '(null)'
Segmentation fault
```
Fault analysis shows two problematic areas:
# the identification string returned by kernel (`Linux KVM Hv`) is not in the list checked.
  This can happen and should return `HYPER_OTHER` without crashes.
# the last **checked** "string" (`hypervisor_ids[HYPER_OTHER]`) is actually `NULL` ,
  which is passed down to `strcmp()` without further checks and causes `segfault`.
  This is a true bug (off-by-one bug) and can be fixed changing the loop to:
  ```
		for (int id = HYPER_NONE + 1; id < HYPER_OTER; id++) {
                   ...
                }
  ```
  (`HYPER_OTHER` has dual role of "unknown" and sentinel).

Full diff of test program to fix both problematic areas is:
```diff
--- test.c.orig	2023-06-21 15:55:38.616000000 +0000
+++ test.c	2023-06-21 16:13:35.240000000 +0000
@@ -15,8 +15,8 @@
 	HYPER_KVM,
 	HYPER_MSHV,
 	HYPER_VMWARE,
-	HYPER_OTHER,
-	HYPER_LAST /* for loop - must be last*/
+	HYPER_QEMU,
+	HYPER_OTHER     /* for loop - must be last*/
 };
 
 const char *hypervisor_ids[] = {
@@ -27,6 +27,7 @@
 	[HYPER_MSHV]    = "Microsoft Hv",
 	/* http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1009458 */
 	[HYPER_VMWARE]  = "VMwareVMware",
+	[HYPER_QEMU]    = "Linux KVM Hv",
 	[HYPER_OTHER]   = NULL
 };
 
@@ -36,6 +37,7 @@
 	[HYPER_KVM]     = "KVM",
 	[HYPER_MSHV]    = "Microsoft",
 	[HYPER_VMWARE]  = "VMWare",
+	[HYPER_QEMU]    = "QEmu",
 	[HYPER_OTHER]   = "other"
 };
 
@@ -141,7 +143,7 @@
 
 		if (*sig.text) {
 			printf("sig.text: '%s'\n", sig.text);
-			for (int id = HYPER_NONE + 1; id < HYPER_LAST; id++) {
+			for (int id = HYPER_NONE + 1; id < HYPER_OTHER; id++) {
 				printf("checking: '%s'\n", hypervisor_ids[id]);
 				if (strcmp(hypervisor_ids[id], sig.text) == 0) {
 					*hypervisor = id;
```
Please note a (probably more recent) version of this code (no python interface) 
can be found at: https://github.com/systemd/systemd/blob/main/src/basic/virt.c

A bit more context can be found at: https://community.ipfire.org/t/failure-in-first-configuration/9932

I am available for any clarification and testing.

Note: IPFire seems to run fine in spite of this error in my environment.
Comment 1 Michael Tremer 2023-07-01 09:11:12 UTC
http://git.ipfire.org/?p=oddments/fireinfo.git;a=commit;h=f092d0a62749b2802ed4f4bbe78208c790cc6831

Author: Michael Tremer <michael.tremer@ipfire.org>

virt: Fix off-by-one error when detecting hypervisor

Reported-by: Mauro Condarelli <mc5686@mclink.it>
Fixes: #13155 - _fireinfo.detect_hypervisor() rises Segmentation fault
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Comment 2 Michael Tremer 2023-07-01 09:14:52 UTC
Hello Mauro,

thank you for submitting this. I have fixed the off-by-one error here:

> https://git.ipfire.org/?p=oddments/fireinfo.git;a=commitdiff;h=f092d0a62749b2802ed4f4bbe78208c790cc6831

The entire routine is very dated and I am not sure how many hypervisors that have emerged since then are being missed by us. I agree that we should rely on the version from systemd, but that is however not easily accessible through their library.

Currently I do not quite know why we are running fireinfo as it is costing a lot of resources for very little benefit, but that is a different debate.

So I am not sure whether it is a good idea to add LXD being detected as QEMU.

> Note: IPFire seems to run fine in spite of this error in my environment.

Yes, this is mostly a cosmetic issue.
Comment 3 Mauro Condarelli 2023-07-01 12:07:28 UTC
Thanks Michael,
I did not actually check your fix, but at first glance it seems you 
changed the "off-by-one" error in an "off-by-two" one.

My suggested fix is to completely avoid "HYPER_LAST" (unused) and use 
"HYPER_OTHER" in the loop.
Comment 4 Michael Tremer 2023-07-03 09:35:26 UTC
http://git.ipfire.org/?p=oddments/fireinfo.git;a=commit;h=e3e68b9baa9723916b1999394432e9ad260cfaa2

Author: Michael Tremer <michael.tremer@ipfire.org>

virt: Fix off-by-one error when detecting hypervisor

Reported-by: Mauro Condarelli <mc5686@mclink.it>
Fixes: #13155 - _fireinfo.detect_hypervisor() rises Segmentation fault
Signed-off-by: Michael Tremer <michael.tremer@ipfire.org>
Comment 5 Michael Tremer 2023-07-03 09:36:42 UTC
(In reply to Mauro Condarelli from comment #3)
> Thanks Michael,
> I did not actually check your fix, but at first glance it seems you 
> changed the "off-by-one" error in an "off-by-two" one.
> 
> My suggested fix is to completely avoid "HYPER_LAST" (unused) and use 
> "HYPER_OTHER" in the loop.

You are right... I produced absolute bullshit here.

Another attempt:

> https://git.ipfire.org/?p=oddments/fireinfo.git;a=commitdiff;h=e3e68b9baa9723916b1999394432e9ad260cfaa2

The entire code there is quite messy and I would rewrite it entirely if I had the time :)