Understanding NASM Macro

Question:

Tag: assembly,macros,ffmpeg,nasm

I came across this macro in an assembly source file and I can't figure out how it works.

First I came across this function (hevc_deblock.h):

cglobal hevc_v_loop_filter_chroma_8, 3, 5, 7, pix, stride, tc, pix0, r3stride
    sub            pixq, 2
    lea       r3strideq, [3*strideq]
    mov           pix0q, pixq
    add            pixq, r3strideq
    TRANSPOSE4x8B_LOAD  PASS8ROWS(pix0q, pixq, strideq, r3strideq)
    CHROMA_DEBLOCK_BODY 8
    TRANSPOSE8x4B_STORE PASS8ROWS(pix0q, pixq, strideq, r3strideq)
    RET

I assume that cglobal does some name mangling, so I looked it up in the other included files and found this macro, which is invoked inside the cglobal macro (x86util.asm):

%macro CAT_UNDEF 2
    %undef %1%2
%endmacro

%macro DEFINE_ARGS 0-*
    %ifdef n_arg_names
        %assign %%i 0
        %rep n_arg_names
            CAT_UNDEF arg_name %+ %%i, q
            CAT_UNDEF arg_name %+ %%i, d
            CAT_UNDEF arg_name %+ %%i, w
            CAT_UNDEF arg_name %+ %%i, h
            CAT_UNDEF arg_name %+ %%i, b
            CAT_UNDEF arg_name %+ %%i, m
            CAT_UNDEF arg_name %+ %%i, mp
            CAT_UNDEF arg_name, %%i
            %assign %%i %%i+1
        %endrep
    %endif

    %xdefine %%stack_offset stack_offset
    %undef stack_offset ; so that the current value of stack_offset doesn't get baked in by xdefine
    %assign %%i 0
    %rep %0
        %xdefine %1q r %+ %%i %+ q
        %xdefine %1d r %+ %%i %+ d
        %xdefine %1w r %+ %%i %+ w
        %xdefine %1h r %+ %%i %+ h
        %xdefine %1b r %+ %%i %+ b
        %xdefine %1m r %+ %%i %+ m
        %xdefine %1mp r %+ %%i %+ mp
        CAT_XDEFINE arg_name, %%i, %1
        %assign %%i %%i+1
        %rotate 1
    %endrep
    %xdefine stack_offset %%stack_offset
    %assign n_arg_names %0
%endmacro

It seems to do that name mangling and append a q to the argument names. However, I don't understand why there are several lines of %undef directives, or why only the variable names with the q suffix seem to be used in the function. It also seems to append a number at the end, but for some reason I'm not seeing it in the other asm file.

What am I missing here?


Answer:

The DEFINE_ARGS macro defines a number of single-line macros that are meant to be used to refer to the arguments of the function that the cglobal macro introduces. For example, if foo is given as the name of the first argument, then DEFINE_ARGS creates the following defines:

%xdefine fooq r0q
%xdefine food r0d
%xdefine foow r0w
%xdefine fooh r0h
%xdefine foob r0b
%xdefine foom r0m
%xdefine foomp r0mp

The suffixes select how the argument is accessed. The first five suffixes (q, d, w, h, b) indicate the size: pointer-sized (quad-word or double-word), double-word, word, byte, and byte respectively. The h suffix selects the high byte of the 16-bit value. The m suffix accesses the argument as a memory operand of unspecified size, while the mp suffix accesses it as a memory operand of pointer size.
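
The aliasing can be reproduced in a few lines. This is only a minimal sketch, not the actual FFmpeg macro code: DEFINE_ARG0 is a hypothetical, cut-down stand-in for DEFINE_ARGS, and the r0x macros are hard-coded on the assumption of a 64-bit System V build, where the first argument arrives in rdi.

```nasm
; Sketch only: DEFINE_ARG0 is a hypothetical stand-in for DEFINE_ARGS.
; Assumption: 64-bit build where argument 0 lives in rdi (System V ABI).
bits 64

%define r0q rdi
%define r0d edi
%define r0w di
%define r0b dil

%macro DEFINE_ARG0 1
    %xdefine %1q r0q        ; <name>q -> 64-bit view of argument 0
    %xdefine %1d r0d        ; <name>d -> 32-bit view
    %xdefine %1w r0w        ; <name>w -> 16-bit view
    %xdefine %1b r0b        ; <name>b -> low byte
%endmacro

DEFINE_ARG0 foo

section .text
    mov     rax, fooq       ; same as: mov rax, rdi
    mov     eax, food       ; same as: mov eax, edi
    mov     ax,  foow       ; same as: mov ax, di
    mov     al,  foob       ; same as: mov al, dil
```

Because %xdefine expands its body at definition time, fooq is stored as rdi directly rather than as a reference to r0q.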

The rNx names that these argument macros get defined as are themselves macros. They expand to the register where the Nth argument is stored, or to its memory location for the m and mp suffixes. So when building for 64-bit Windows, the macros for the first argument are effectively:

%define r0q rcx
%define r0d ecx
%define r0w cx
%define r0h ch
%define r0b cl
%define r0m ecx
%define r0mp rcx

Note that since the Windows 64-bit calling convention passes the first argument in a register (RCX) there's no memory location that corresponds to this argument.

When building for 32-bit targets, the rNx macros for the first argument end up being defined like this:

%define r0q eax
%define r0d eax
%define r0w ax
%define r0h ah
%define r0b al
%define r0m [esp + stack_size + 4]
%define r0mp dword [esp + stack_size + 4]

The r0q macro in this case only accesses the 32-bit register, because the 64-bit registers aren't accessible in 32-bit code. Since the first argument is passed on the stack under the 32-bit calling conventions, the prologue code generated by the cglobal macro loads the first argument into EAX.
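
As a rough illustration of that load, here is a hand-written sketch, not the prologue that cglobal actually emits. It assumes the 32-bit cdecl convention and that nothing has been pushed yet, so stack_size is set to 0; load_first_arg is a hypothetical function name.

```nasm
; Sketch only: a hand-written version of the load described above.
; Assumptions: 32-bit cdecl, nothing pushed yet, so stack_size is 0 here.
bits 32

%assign stack_size 0
%define r0d eax
%define r0m [esp + stack_size + 4]

section .text
global load_first_arg           ; hypothetical function name
load_first_arg:
    mov     r0d, r0m            ; mov eax, [esp + 0 + 4]: skip the return address
    ret
```

After the mov, the register names (r0d and friends) refer to the loaded copy, while r0m still refers to the original stack slot.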

Apparently the code you've seen only accesses pointer-sized arguments through these macros, which is why you're only seeing q suffixes.

The purpose of the %undef lines at the start of the DEFINE_ARGS macro is to undefine the argument macros that the previous invocation of DEFINE_ARGS defined. Otherwise they'd remain defined in the current function. The previous function's argument names are stored in single-line macros named arg_nameN, where N is the argument index.
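
The effect of the cleanup can be shown with a minimal sketch. It writes the %undef lines out by hand instead of looping over arg_nameN, and it assumes r0q and r1q are hard-coded to the System V x86-64 argument registers; the argument names src, len, dst and cnt are made up for illustration.

```nasm
; Sketch only: shows why stale aliases are removed between functions.
; Assumption: r0q/r1q are hard-coded to the System V x86-64 argument registers.
bits 64

%define r0q rdi
%define r1q rsi

; Names declared for a first function.
%xdefine srcq r0q
%xdefine lenq r1q

; A second function uses different names, so the old ones are dropped first.
%undef srcq
%undef lenq
%xdefine dstq r0q
%xdefine cntq r1q

%ifdef srcq
    %error "srcq leaked from the previous function"
%endif

section .text
    mov     rax, dstq       ; same as: mov rax, rdi
    add     rax, cntq       ; same as: add rax, rsi
```

Without the %undef lines, a stray srcq in the second function would quietly assemble as rdi instead of being reported as an undefined symbol.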

Please don't follow the example set by the code you're reading. These macros essentially create a unique derivative programming language, one that's only really understood by their authors. It's also not the most efficient way of doing things. If I were writing this code I'd use C or C++ with vector intrinsics. That would leave all the differences between 32-bit and 64-bit, and between Windows and Linux, to the compiler, which could generate better code than these macros.

