Getting processor information using cpuid instruction and inline assembly

cpuid is a processor-specific instruction used to get a processor's information and features. In this post we are going to learn how to extract that information using inline assembly in C.

cpuid is a processor-specific instruction, present in Intel and other x86 processors, used to get the CPU's details and supported features. Tools like CPU-Z or cpuid (Linux) can show this information from a GUI or the command line, but knowing how to extract it manually gives you an edge: you can use it anywhere in your own projects or code. In this article we will discuss how to call the cpuid instruction from inline assembly to extract various pieces of CPU information.

Inline Assembly

Let's first get a quick overview of inline assembly. It is a compiler feature that lets you embed low-level assembly code inside high-level programs written in C/C++.

Inside your C/C++ program you can set eax to 0 using the following syntax: __asm__("mov $0, %eax\n\t");. In this line of code (AT&T syntax, as used by GCC) registers are prefixed with % and constant values with $.

The second syntax we care about comes from extended inline assembly. Below is some sample code for it.

#include <stdio.h>
int main(void) {
	int a;
	__asm__("mov $0x4, %eax\n\t");              /* basic asm: eax = 4 */
	__asm__("mov %%eax, %0\n\t" : "=r" (a));    /* extended asm: a = eax */
	printf("Value in eax register is: %d\n", a);
	return 0;
}

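This code snippet first moves the value 4 into the eax register, then copies that value into the variable a, and finally prints it on stdout. We already know what line 4 does (moving 4 into eax), but let's look at the syntax of line 5, since it is extended inline assembly. Keep in mind that the compiler does not promise to leave eax untouched between two separate __asm__ statements, so snippets written this way are only dependable in simple, unoptimized builds.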

The first thing to notice is that a register is written with %% in extended inline assembly. Then %0 refers to operand number 0, the first operand listed after the : sign. "=r" is a constraint: = marks the operand as an output and r asks the compiler to place it in a general purpose register, so the value we move ends up in (a). Extended inline assembly is a little complicated to understand, but we only care about these two syntaxes for now. If the explanation above is not clear, then take a look at the introduction to inline assembly here.
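The snippets in this post issue several separate __asm__ statements, which relies on the compiler leaving the registers alone in between. A more robust pattern is to put the whole operation in one extended statement and let constraints name the registers. The following is a minimal sketch, assuming GCC or Clang on x86; cpuid_query is a hypothetical helper name, not part of any library. The "a", "b", "c" and "d" constraints pin operands to eax, ebx, ecx and edx.

```c
#include <stdint.h>

/* Hypothetical helper: run cpuid with eax=leaf and ecx=subleaf,
   then store the resulting eax, ebx, ecx, edx in regs[0..3]. */
void cpuid_query(uint32_t leaf, uint32_t subleaf, uint32_t regs[4])
{
    __asm__ volatile("cpuid"
                     : "=a"(regs[0]), "=b"(regs[1]),
                       "=c"(regs[2]), "=d"(regs[3])   /* outputs */
                     : "a"(leaf), "c"(subleaf));      /* inputs  */
}
```

Calling cpuid_query(0, 0, regs) leaves the highest basic function number in regs[0]. Because inputs and outputs are declared in a single statement, the compiler knows exactly which registers are read and written.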

CPUID Instruction

The cpuid instruction can be used on Intel processors to get CPU-specific information. Its output depends on the value placed in eax (and, for some functions, ecx) before the call, and the results come back in the general purpose registers eax, ebx, ecx and edx. For example, with eax=0, cpuid returns the manufacturer string in ebx, edx and ecx. Let's take a detailed look at the information we can fetch with the cpuid instruction.

EAX = 0

Running cpuid with eax=0 returns the processor's manufacturer string and the highest basic function parameter supported. The 12-character string is stored in the EBX, EDX and ECX registers, in that order. Let's try to extract that string using a simple C program.

int a[4] = {0};   /* 12 bytes of string; a[3] stays 0 and terminates it */
__asm__("mov $0x0, %eax\n\t");
__asm__("cpuid\n\t");
__asm__("mov %%ebx, %0\n\t" : "=r" (a[0]));
__asm__("mov %%edx, %0\n\t" : "=r" (a[1]));
__asm__("mov %%ecx, %0\n\t" : "=r" (a[2]));
printf("%s\n", (char *)a);

In the above program we first set eax to 0, run the cpuid instruction, and then store the output from the ebx, edx and ecx registers in an array a in order to print it.
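The byte order inside each register can be made explicit with a small unpacking routine. This is only a sketch: vendor_from_regs is a hypothetical helper that takes the three register values as plain integers, so it can be checked without executing cpuid. The constants in the usage note are the well-known register values for the "GenuineIntel" string.

```c
#include <stdint.h>

/* Hypothetical helper: unpack the three registers returned by cpuid leaf 0
   into a 13-byte, NUL-terminated vendor string (order: ebx, edx, ecx). */
void vendor_from_regs(uint32_t ebx, uint32_t edx, uint32_t ecx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++) {
        for (int b = 0; b < 4; b++) {
            /* The lowest byte of each register is the earliest character. */
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
        }
    }
    out[12] = '\0';
}
```

For example, vendor_from_regs(0x756e6547, 0x49656e69, 0x6c65746e, buf) fills buf with "GenuineIntel". Extracting bytes with shifts keeps the routine independent of how the compiler lays out an int array in memory.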

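The value returned in the eax register is the highest basic function number that cpuid accepts on this processor. It differs between processor generations, as listed in the following table.

[Table: highest basic function parameter by processor. Source: Wikipedia]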

You can get the value returned in eax using the following code.

int a[1];
__asm__("mov $0x0, %eax\n\t");
__asm__("cpuid\n\t");
__asm__("mov %%eax, %0\n\t" : "=r" (a[0]));
printf("%x\n", a[0]);   /* print the value itself, not the array's address */
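Since eax from this call is the highest basic function number, it can guard later queries. GCC and Clang also ship a <cpuid.h> header whose __get_cpuid_max function wraps this lookup, so the sketch below uses it instead of hand-written assembly. leaf_is_supported is a hypothetical name, and the code assumes an x86 target with one of those compilers.

```c
#include <cpuid.h>   /* GCC/Clang header, x86 only */

/* Hypothetical helper: returns 1 if the given basic function number
   can be queried on this processor, 0 otherwise. */
int leaf_is_supported(unsigned int leaf)
{
    /* __get_cpuid_max(0, ...) returns the highest basic function number,
       or 0 if the cpuid instruction itself is unavailable. */
    unsigned int max_leaf = __get_cpuid_max(0, 0);
    return max_leaf != 0 && leaf <= max_leaf;
}
```

Checking leaf_is_supported(1) before asking for the feature flags avoids interpreting registers from a function the processor does not implement.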

EAX = 1

Using the cpuid instruction with eax=1 returns processor features and model-related information. The model-related information, including the stepping ID, model and family, is returned in the eax register, whereas the processor features are returned in ecx and edx, with additional information in ebx.

Let's try to extract this information.

int a[4];
__asm__("mov $0x1 , %eax\n\t");
__asm__("cpuid\n\t");
__asm__("mov %%eax, %0\n\t" : "=r" (a[0])); /* model and family */
__asm__("mov %%ebx, %0\n\t" : "=r" (a[1])); /* additional feature info */
__asm__("mov %%ecx, %0\n\t" : "=r" (a[2])); /* feature flags */
__asm__("mov %%edx, %0\n\t" : "=r" (a[3])); /* feature flags */

The value returned in each register prints as a plain integer, but it is really a set of bit fields, so we need to break it down into individual bits. For that purpose you can use this simple function.

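int binaryNum[32];   /* binaryNum[i] holds bit i of the last converted value */

void decToBinary(unsigned int n)
{
    /* Store all 32 bits, so that bits left over from an earlier call are
       cleared and values with bit 31 set are handled too. */
    for (int i = 0; i < 32; i++) {
        binaryNum[i] = n % 2;
        n = n / 2;
    }
}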

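The bits of the eax value can be mapped using the following table.

Bits     Field
3-0      Stepping ID
7-4      Model
11-8     Family ID
13-12    Processor Type
19-16    Extended Model ID
27-20    Extended Family ID

Source: Wikipedia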
int outputEax(){
    printf("-------Signature(EAX register):-------");
    printf("\nStepping ID:%d%d%d%d",binaryNum[3],binaryNum[2],binaryNum[1],binaryNum[0]);
    printf("\nModel:%d%d%d%d",binaryNum[7],binaryNum[6],binaryNum[5],binaryNum[4]);
    printf("\nFamily ID:%d%d%d%d",binaryNum[11],binaryNum[10],binaryNum[9],binaryNum[8]);
    printf("\nProcessor Type:%d%d",binaryNum[13],binaryNum[12]);
    printf("\nExtended Model ID:%d%d%d%d",binaryNum[19],binaryNum[18],binaryNum[17],binaryNum[16]);
    printf("\nExtended Family ID:%d%d%d%d%d%d%d%d",binaryNum[27],binaryNum[26],binaryNum[25],binaryNum[24],binaryNum[23],binaryNum[22],binaryNum[21],binaryNum[20]);
    printf("\n");
    return 0;

}
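The same fields can also be pulled out directly with shifts and masks, without the intermediate bit array. The following is a sketch under stated assumptions: the struct and function names are hypothetical, and the display_family and display_model rules are the ones Intel documents for combining the base and extended fields (other vendors may combine them differently).

```c
#include <stdint.h>

/* Hypothetical container for the fields of eax after cpuid with eax=1. */
struct cpu_signature {
    unsigned stepping, model, family, type, ext_model, ext_family;
    unsigned display_family, display_model;
};

struct cpu_signature decode_signature(uint32_t eax)
{
    struct cpu_signature s;
    s.stepping   =  eax        & 0xF;    /* bits 3-0   */
    s.model      = (eax >> 4)  & 0xF;    /* bits 7-4   */
    s.family     = (eax >> 8)  & 0xF;    /* bits 11-8  */
    s.type       = (eax >> 12) & 0x3;    /* bits 13-12 */
    s.ext_model  = (eax >> 16) & 0xF;    /* bits 19-16 */
    s.ext_family = (eax >> 20) & 0xFF;   /* bits 27-20 */

    /* Assumed combination rules, as documented by Intel. */
    s.display_family = (s.family == 0xF) ? s.family + s.ext_family : s.family;
    s.display_model  = (s.family == 0x6 || s.family == 0xF)
                     ? (s.ext_model << 4) + s.model : s.model;
    return s;
}
```

For a signature value of 0x000906EA this yields stepping 0xA, family 6 and a display model of 0x9E.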

In a similar way, the bits returned in the edx, ecx and ebx registers follow the lists in these tables.

Source: Wikipedia (two tables)

Features such as vmx (virtualization support) and pge (page global enable) are checked extensively by various tools, and even by malware.
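Checking a single feature comes down to testing one bit of the register returned for eax=1. A minimal sketch follows; has_feature and the two macro names are hypothetical, while the bit positions (vmx is bit 5 of ecx, pge is bit 13 of edx) come from the published feature tables for this function.

```c
#include <stdint.h>

/* Bit positions in the registers returned by cpuid with eax=1. */
#define CPUID1_ECX_VMX 5    /* virtualization support, in ecx */
#define CPUID1_EDX_PGE 13   /* page global enable, in edx */

/* Hypothetical helper: returns 1 if the given bit is set in reg. */
int has_feature(uint32_t reg, unsigned bit)
{
    return (int)((reg >> bit) & 1u);
}
```

With ecx and edx holding the values returned for eax=1, has_feature(ecx, CPUID1_ECX_VMX) reports virtualization support and has_feature(edx, CPUID1_EDX_PGE) reports page global enable.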

Getting Brand String

The brand string gives the processor's name and version details as text. To extract it, we need to run the cpuid instruction with eax inputs of 80000002H through 80000004H. For each input value, cpuid returns 16 ASCII characters in eax, ebx, ecx and edx. You can use the following code to extract the brand string.

int a[10];   /* global, so zero-initialized: a[4] terminates the string */
void brandString(int eaxValues)
{
    if (eaxValues == 1) {
        __asm__("mov $0x80000002 , %eax\n\t");
    }
    else if (eaxValues == 2) {
        __asm__("mov $0x80000003 , %eax\n\t");
    }
    else if (eaxValues == 3) {
        __asm__("mov $0x80000004 , %eax\n\t");
    }
    __asm__("cpuid\n\t");
    __asm__("mov %%eax, %0\n\t" : "=r" (a[0]));
    __asm__("mov %%ebx, %0\n\t" : "=r" (a[1]));
    __asm__("mov %%ecx, %0\n\t" : "=r" (a[2]));
    __asm__("mov %%edx, %0\n\t" : "=r" (a[3]));
    printf("%s", (char *)a);   /* the 16 characters from this call */
}

void getCpuID()
{
    __asm__("xor %eax , %eax\n\t");
    __asm__("xor %ebx , %ebx\n\t");
    __asm__("xor %ecx , %ecx\n\t");
    __asm__("xor %edx , %edx\n\t");
    printf("Brand string is ");
    brandString(1);
    brandString(2);
    brandString(3);
    printf("\n");
}

int main(){
    getCpuID();
}

Here we are setting eax to each value from 0x80000002 to 0x80000004 and printing the output after each cpuid call. The output string on my system is shown below.

[Screenshot: Getting brand string]
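Instead of printing after every call, the twelve register values can be collected first and unpacked into one 48-character buffer. The sketch below assumes the caller has already gathered the registers; brand_from_regs is a hypothetical helper. Note also that cpuid with eax=0x80000000 reports the highest extended function number in eax, so the brand string is only available when that value is at least 0x80000004.

```c
#include <stdint.h>

/* Hypothetical helper: regs holds eax, ebx, ecx, edx from the calls with
   eax = 0x80000002, 0x80000003 and 0x80000004, in that order (12 values). */
void brand_from_regs(const uint32_t regs[12], char out[49])
{
    for (int r = 0; r < 12; r++) {
        for (int b = 0; b < 4; b++) {
            /* The lowest byte of each register is the earliest character. */
            out[r * 4 + b] = (char)((regs[r] >> (8 * b)) & 0xFF);
        }
    }
    out[48] = '\0';   /* terminate even if all 48 bytes were filled */
}
```

Keeping the string in one buffer makes it easy to trim or compare later, which printing piece by piece does not allow.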

EAX = 2

Running cpuid with eax=2 returns cache and TLB descriptor information. The descriptors come back in the eax, ebx, ecx and edx registers, one byte each, and may appear in any order. The table at the source below lists what each descriptor byte stands for.

Source: https://c9x.me/x86/html/file_module_x86_id_45.html
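Pulling the descriptor bytes out of the four registers can be sketched as follows. The rules used here are assumptions taken from Intel's description of this function: the lowest byte of eax is not a descriptor, a register with bit 31 set carries no valid descriptors, and a byte of 0x00 is a null descriptor. collect_descriptors is a hypothetical helper, and looking up the meaning of each byte is left to the table referenced above.

```c
#include <stdint.h>

/* Hypothetical helper: regs holds eax, ebx, ecx, edx from cpuid with eax=2.
   Writes up to 15 descriptor bytes to out and returns how many were found. */
int collect_descriptors(const uint32_t regs[4], uint8_t out[15])
{
    int count = 0;
    for (int r = 0; r < 4; r++) {
        if (regs[r] & 0x80000000u) {
            continue;               /* bit 31 set: no valid descriptors here */
        }
        for (int b = 0; b < 4; b++) {
            uint8_t d = (uint8_t)((regs[r] >> (8 * b)) & 0xFF);
            if (r == 0 && b == 0) {
                continue;           /* lowest byte of eax is not a descriptor */
            }
            if (d != 0x00) {        /* 0x00 is the null descriptor */
                out[count++] = d;
            }
        }
    }
    return count;
}
```

For register values 0x00feff01, 0x80000055, 0 and 0x00c30000, the function returns 3 and stores 0xff, 0xfe and 0xc3.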

The cpuid instruction returns many more details than we have covered here, though they are needed less often. For example, eax=7 with ecx=0 returns extended feature flags. Setting those aside, we have covered the parts you are most likely to need in the future. That's it for this post; check out the other blog posts for more interesting stuff.