Wednesday, June 12, 2024

Smallest hello world in linux assembly

Smallest hello world in linux assembly


From https://jameshfisher.com/2018/03/10/linux-assembly-hello-world/


global _start

section .text

_start:
  mov rax, 1        ; write(
  mov rdi, 1        ;   STDOUT_FILENO,
  mov rsi, msg      ;   "Hello, world!\n",
  mov rdx, msglen   ;   sizeof("Hello, world!\n")
  syscall           ; );

  mov rax, 60       ; exit(
  mov rdi, 0        ;   EXIT_SUCCESS
  syscall           ; );

section .rodata
  msg: db "Hello, world!", 10
  msglen: equ $ - msg


However, when we assemble and link it, it will actually be quite large.


$ nasm -f elf64 -o hello.o hello.s
$ ld -o hello hello.o
$ ./hello
Hello, world!

$ ls -al hello
-rwxrwxr-x 1 d d 8872 Jun 12 08:22 hello


That's because it uses a section .rodata, which requires a read-only memory page, on top of a code page (read-executable). So that requires 8KB (if 4KB pages are used) in your binary.


Still, it's smaller than a "puts("Hello world")" C example program.



#include <stdio.h>

int main()
{
    puts("Hello, world!");
    return 0;
}


Size of C program:


$ gcc test.c
$ ls -al a.out
-rwxrwxr-x 1 d d 15960 Jun 12 08:24 a.out

$ strip a.out
$ ls -al a.out
-rwxrwxr-x 1 d d 14472 Jun 12 08:25 a.out



Going back to the assembly program, to shrink it down even more so, the modified source code is then:


global _start

section .text

_start:
  mov rax, 1        ; write(
  mov rdi, 1        ;   STDOUT_FILENO,
  mov rsi, msg      ;   "Hello, world!\n",
  mov rdx, msglen   ;   sizeof("Hello, world!\n")
  syscall           ; );

  mov rax, 60       ; exit(
  mov rdi, 0        ;   EXIT_SUCCESS
  syscall           ; );

  msg: db "Hello, world!", 10
  msglen: equ $ - msg


The same code, without the section .rodata. It just means the text string "Hello World" is actually in the code (.text section) page.

Next step is to strip the binary -- to make it even slightly more smaller.

This is the best we can do, without hacking the binary any further.


$ ls -al hello
-rwxrwxr-x 1 d d 4360 Jun 12 08:19 hello



So that's 4KB, plus abit more, if you include the ELF headers.


Creating our own ELF headers


Taking from the following articles:

https://www.muppetlabs.com/%7Ebreadbox/software/tiny/teensy.html

https://stackoverflow.com/questions/53382589/smallest-executable-program-x86-64-linux


We create our own ELF header and just build using `nasm -o test test.s`


bits 64
            org 0x08048000

ehdr:                                           ; Elf64_Ehdr
            db  0x7F, "ELF", 2, 1, 1, 0         ;   e_ident
    times 8 db  0
            dw  2                               ;   e_type
            dw  62                              ;   e_machine
            dd  1                               ;   e_version
            dq  _start                          ;   e_entry
            dq  phdr - $$                       ;   e_phoff
            dq  0                               ;   e_shoff
            dd  0                               ;   e_flags
            dw  ehdrsize                        ;   e_ehsize
            dw  phdrsize                        ;   e_phentsize
            dw  1                               ;   e_phnum
            dw  0                               ;   e_shentsize
            dw  0                               ;   e_shnum
            dw  0                               ;   e_shstrndx

ehdrsize    equ $ - ehdr

phdr:                                           ; Elf64_Phdr
            dd  1                               ;   p_type
            dd  5                               ;   p_flags
            dq  0                               ;   p_offset
            dq  $$                              ;   p_vaddr
            dq  $$                              ;   p_paddr
            dq  filesize                        ;   p_filesz
            dq  filesize                        ;   p_memsz
            dq  0x1000                          ;   p_align

phdrsize    equ     $ - phdr

_start:
  mov rax, 1        ; write(
  mov rdi, 1        ;   STDOUT_FILENO,
  mov rsi, msg      ;   "Hello, world!\n",
  mov rdx, msglen   ;   sizeof("Hello, world!\n")
  syscall           ; );

  mov rax, 60       ; exit(
  mov rdi, 0        ;   EXIT_SUCCESS
  syscall           ; );

  msg: db "Hello, world!", 10
  msglen: equ $ - msg

filesize      equ     $ - $$


We build it with nasm:


$ nasm -o test test.s

$ ls -al test
-rwxr-xr-x 1 d d 173 Jun 12 08:35 test



The output is 173 bytes! That's small. But it can be improved, based off those articles above.

Why I think the electronics industry is a waste of resources

Why I think the electronics industry is a waste of resources Growing up with an electrical engineer dad, I was accustomed to seeing PCBs (pr...