David's blog
Saturday, September 7, 2024
Friday, September 6, 2024
What is a buffer overflow, anyway? By j00n1x
S0 my f3ll0w pupils, j00 w4nt t0 kn0w wh4t a buff3r 0verfl0w is, am i right? J00 w4nt t0 b3 1337?
Let's follow in the footsteps of someone before us: aleph1's article in phr4ck.
https://phrack.org/issues/49/14.html
Okay? Have you read that? D0 j00 f33l l33t n0w?
Wednesday, July 10, 2024
Why I think the electronics industry is a waste of resources
Growing up with an electrical engineer dad, I was accustomed to seeing PCBs (printed circuit boards) and electronic chips and components. And I loved it, I enjoy electronics and computers very much, which is why I'm a software engineer.
But let's be frank. The electronics industry is a waste of resources. Pen and paper is just fine for everyone.
You know what happens to the copper on PCBs? It gets etched off! It's either lasered off or removed with an etching solution and never seen again. That's copper that's wanted for EV cars nowadays.
But inevitably, a PCB is more efficient at copper usage than a whole bunch of copper wiring in an ad-hoc wire-wrap circuit.
Of course, I talked about pen and paper before, and writing letters is more inefficient than sending emails, cause it costs petrol to deliver the letters. But CB/Ham radio is more efficient than emails, cause it's just sent over the airwaves, without needing a fibre optic or computer and servers and data centres to run the entire operation.
Morse code telegrams are probably the most resource efficient. They're like CB/Ham radio, but the resources used to create the telegraph machine are cheaper, since all it requires is a coil. However, radio wave interference would reduce the transmission distance, so they would use more electricity.
We need to socialise again. Pen and paper for writing down notes, just like in school, and using our mouths to talk, to give out ideas. They still do this, right?
You know, there's ways to create electronic circuits with conductive ink... imagine drawing a coil on paper, and that acts like an inductor... or two parallel lines, and that actually acts like a capacitor... in fact, circuit diagrams were made for a reason, in that it resembles an actual conductor at work, so conductive ink could be made to work that way.
I want everyone to experience the simpler life. A time where electronics isn't used, and everything can be made from scratch or improvised.
Tennis, anyone? (Wait a second... the machining that it takes to make a racquet, and the rubber used in the tennis ball...)
I know. My new hobby should be gardening. Some of mum's old bamboo shoots are dying/dead, so they need to be replaced.
Drinking tea and coffee. That's my favourite thing to do.
OMG, what is a geek supposed to do if everything is a stupid waste of resources? How do I align myself with the planet? Suggestions anyone?
Wednesday, June 12, 2024
Smallest hello world in linux assembly
From https://jameshfisher.com/2018/03/10/linux-assembly-hello-world/
global _start
section .text
_start:
mov rax, 1 ; write(
mov rdi, 1 ; STDOUT_FILENO,
mov rsi, msg ; "Hello, world!\n",
mov rdx, msglen ; sizeof("Hello, world!\n")
syscall ; );
mov rax, 60 ; exit(
mov rdi, 0 ; EXIT_SUCCESS
syscall ; );
section .rodata
msg: db "Hello, world!", 10
msglen: equ $ - msg
However, when we assemble and link it, it will actually be quite large.
$ nasm -f elf64 -o hello.o hello.s
$ ld -o hello hello.o
$ ./hello
Hello, world!
$ ls -al hello
-rwxrwxr-x 1 d d 8872 Jun 12 08:22 hello
That's because it uses a section .rodata, which requires a read-only memory page, on top of a code page (read-executable). So that requires 8KB (if 4KB pages are used) in your binary.
Still, it's smaller than a C example program that calls puts("Hello, world!"):
#include <stdio.h>
int main()
{
puts("Hello, world!");
return 0;
}
Size of C program:
$ gcc test.c
$ ls -al a.out
-rwxrwxr-x 1 d d 15960 Jun 12 08:24 a.out
$ strip a.out
$ ls -al a.out
-rwxrwxr-x 1 d d 14472 Jun 12 08:25 a.out
Going back to the assembly program, we can shrink it down even further. The modified source code is:
global _start
section .text
_start:
mov rax, 1 ; write(
mov rdi, 1 ; STDOUT_FILENO,
mov rsi, msg ; "Hello, world!\n",
mov rdx, msglen ; sizeof("Hello, world!\n")
syscall ; );
mov rax, 60 ; exit(
mov rdi, 0 ; EXIT_SUCCESS
syscall ; );
msg: db "Hello, world!", 10
msglen: equ $ - msg
It's the same code, without the section .rodata. It just means the text string "Hello, world!" now lives in the code (.text section) page.
The next step is to strip the binary, to make it slightly smaller still.
This is the best we can do, without hacking the binary any further.
$ ls -al hello
-rwxrwxr-x 1 d d 4360 Jun 12 08:19 hello
So that's 4KB, plus a bit more, if you include the ELF headers.
Creating our own ELF headers
Taking from the following articles:
https://www.muppetlabs.com/%7Ebreadbox/software/tiny/teensy.html
https://stackoverflow.com/questions/53382589/smallest-executable-program-x86-64-linux
We create our own ELF header and just build using `nasm -o test test.s`
bits 64
org 0x08048000
ehdr: ; Elf64_Ehdr
db 0x7F, "ELF", 2, 1, 1, 0 ; e_ident
times 8 db 0
dw 2 ; e_type
dw 62 ; e_machine
dd 1 ; e_version
dq _start ; e_entry
dq phdr - $$ ; e_phoff
dq 0 ; e_shoff
dd 0 ; e_flags
dw ehdrsize ; e_ehsize
dw phdrsize ; e_phentsize
dw 1 ; e_phnum
dw 0 ; e_shentsize
dw 0 ; e_shnum
dw 0 ; e_shstrndx
ehdrsize equ $ - ehdr
phdr: ; Elf64_Phdr
dd 1 ; p_type
dd 5 ; p_flags
dq 0 ; p_offset
dq $$ ; p_vaddr
dq $$ ; p_paddr
dq filesize ; p_filesz
dq filesize ; p_memsz
dq 0x1000 ; p_align
phdrsize equ $ - phdr
_start:
mov rax, 1 ; write(
mov rdi, 1 ; STDOUT_FILENO,
mov rsi, msg ; "Hello, world!\n",
mov rdx, msglen ; sizeof("Hello, world!\n")
syscall ; );
mov rax, 60 ; exit(
mov rdi, 0 ; EXIT_SUCCESS
syscall ; );
msg: db "Hello, world!", 10
msglen: equ $ - msg
filesize equ $ - $$
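We build it with nasm, whose default output format is a flat binary, so no linker is needed (the file still needs chmod +x before it will run):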
$ nasm -o test test.s
$ ls -al test
-rwxr-xr-x 1 d d 173 Jun 12 08:35 test
The output is 173 bytes! That's small. But it can be improved, based off those articles above.
Friday, May 17, 2024
Hacking 102 - Leaking the canary with strncpy
A convoluted example of how you can leak GCC's stack protector canary with a strncpy, thanks to strncpy not null terminating when the strlen is greater than or equal to the buffer size specified in the 3rd parameter.
get_canary() is only there so we can verify that the leaked value is actually the same as GCC's canary.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
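unsigned long get_canary()
{
unsigned long canary;
// read the stack protector's secret from %fs:0x28 into a real return value,
// instead of relying on whatever happens to be left in %rax
asm volatile("mov %%fs:0x28, %0" : "=r" (canary));
return canary;
}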
// prints out our string, which is a non-null terminated string, thanks to strncpy's rule of not null terminating!
void print_str(const char *str)
{
for (int i = 0; i < strlen(str); i++) {
printf("%x", (unsigned char) str[i]);
}
printf("\n");
for (int i = 0; i < strlen(str) + 16; i+=8) {
unsigned long *p = (unsigned long *) &str[i];
printf("%lx ", *p);
}
printf("\n");
unsigned long *canary = (unsigned long *) &str[256+8];
printf("Is this the canary getting leaked out? %lx\n", *canary);
printf("Hint: Ignore the last two hex digits, as that's just 'A' character, overwriting the 0x00 null\n");
}
void func(const char *input)
{
char buf[256];
// NB: This is a convoluted strncpy, probably won't find it in the wild, but you never know...
strncpy(buf, input, sizeof(buf)+9); // non-null termination AND off-by-one! -- overwrite with 'A' on the last number of the canary, as it's zero (i.e. little endian 00 aa bb cc dd ee ff 11, so the 00 stops printing it out
// cause it's null, but if we use strncpy with an off-by-one that hits the last digit (first digit little endian) then we can print it out
// also, 8 bytes is just empty part of the stack, so that's why it's 9 bytes. 8 empty bytes, + 1 off-by-one.
print_str(buf);
}
int main()
{
char input[1024];
memset(input, 'A', sizeof(input));
printf("Canary: %lx\n", get_canary());
func(input);
}
Output:
$ gcc test.c
$ ./a.out
Canary: 6af07ba9ba12d800
41414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141414141d812baa97bf06ad0ac548dff7f
4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 4141414141414141 6af07ba9ba12d841 7fff8d54acd0 55ce63ecb3f8 4141414141414141
Is this the canary getting leaked out? 6af07ba9ba12d841
Hint: Ignore the last two hex digits, as that's just 'A' character, overwriting the 0x00 null
*** stack smashing detected ***: terminated
Notice that the bytes d812baa97bf06a near the end of the first hex dump are the canary in reverse, because it's stored little endian on x86_64.
Monday, May 13, 2024
Hacking 101 - stack buffer overflows on x86_64
I won't explain in too much detail, but the following code creates a buffer overflow that controls the return instruction pointer on the stack, changing the flow of the code into calling func2().
/*
compile: gcc test.c -fno-stack-protector
run: ./a.out 4 (or ./a.out 2... greater than ./a.out 1)
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//#define USE_CANARY
void func2()
{
printf("I did it!\n");
}
void func(const char *input)
{
#ifdef USE_CANARY
long int canary = 0xdeadbeefc001d00d;
#endif
char buf[256];
// strcpy(buf, input);
// cause there's a null in the address...
memcpy(buf, input, 256+8*8);
// buffer overfl0wwwwwwwwwww
printf("buf: %s\n", buf);
fflush(stdout);
#ifdef USE_CANARY
if (canary != 0xdeadbeefc001d00d) {
printf("Buffer overflow detected! Aborting!\n");
abort();
}
#endif
}
int main(int argc, char **argv)
{
char input[1024];
int loop;
// fill up input with 'A' x 256 (leaving 1024-256 bytes left)
memset(input, 'A', 256);
printf("size of func2's pointer (in bytes): %zu\n", sizeof(&func2));
printf("func2's address: %lx\n", (unsigned long) func2);
if (argc < 2) {
loop = 4;
}
else {
loop = atoi(argv[1]);
}
// fill past the 'A's with a pointer to func2(), up to "loop".
for (int i = 0; i < loop; i++) {
unsigned long *p = (unsigned long *) (input + 256 + i * sizeof(&func2));
*p = (unsigned long) func2;
}
printf("input: %s\n", input);
// verify the func2's addresses are actually inside the input buffer.
for (int i = 0; i < loop; i++) {
char *p = input + 256 + i * sizeof(&func2);
unsigned long *q = (unsigned long *) p;
printf("address: %lx\n", *q);
}
// call func with input buffer.
func(input);
}
Output:
$ gcc test.c -fno-stack-protector
$ ./a.out
size of func2's pointer (in bytes): 8
func2's address: 561e883a51e9
input: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�Q:�V
address: 561e883a51e9
address: 561e883a51e9
address: 561e883a51e9
address: 561e883a51e9
buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA�Q:�V
I did it!
I did it!
I did it!
Segmentation fault (core dumped)
Creating our own Canary
Notice that if you compile with -DUSE_CANARY, it uses our artificial canary, and will abort before the overflow gets to use the overwritten return instruction pointer.
$ gcc test.c -fno-stack-protector -DUSE_CANARY
$ ./a.out
size of func2's pointer (in bytes): 8
func2's address: 562a98ea0209
input: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA �*V
address: 562a98ea0209
address: 562a98ea0209
address: 562a98ea0209
address: 562a98ea0209
buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA �*V
Buffer overflow detected! Aborting!
Aborted (core dumped)
Comparing with gcc's canary
Now compile it normally (with gcc's stack protector enabled) and check out the assembly using `objdump -D ./a.out`:
0000000000001223 <func>:
1223: f3 0f 1e fa endbr64
1227: 55 push %rbp
1228: 48 89 e5 mov %rsp,%rbp
122b: 48 81 ec 20 01 00 00 sub $0x120,%rsp
1232: 48 89 bd e8 fe ff ff mov %rdi,-0x118(%rbp)
1239: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
1240: 00 00
1242: 48 89 45 f8 mov %rax,-0x8(%rbp)
1246: 31 c0 xor %eax,%eax
.... more opcodes ....
... followed at the end by ...
12a4: e8 17 fe ff ff call 10c0 <__stack_chk_fail@plt>
12a9: c9 leave
12aa: c3 ret
This is the compiler setting up the canary at the start of the function and checking it at the end.
The objdump without stack protector for func is:
<func>:
1203: f3 0f 1e fa endbr64
1207: 55 push %rbp
1208: 48 89 e5 mov %rsp,%rbp
120b: 48 81 ec 10 01 00 00 sub $0x110,%rsp
1212: 48 89 bd f8 fe ff ff mov %rdi,-0x108(%rbp)
You can see the canary's secret value is read from %fs:0x28 into register %rax, and then stored at %rbp - 8. That slot sits just below the saved frame pointer (at %rbp), which in turn sits just below the return instruction pointer (at %rbp + 8).
If you can leak the %fs:0x28 value, you can overwrite the canary with the same value during the overflow, and thus bypass the canary check.
This is the reason why %eax is cleared straight afterwards (xor %eax,%eax, which zeroes all of %rax): to avoid leaking the value through the %rax register.
Getting the canary secret value from gcc via some means, and then using it
Source code follows:
/*
compile: gcc test.c (as in the output below, so gcc's own stack protector stays in play)
run: ./a.out (defaults to loop = 2, which matches this stack layout)
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
//#define USE_CANARY
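unsigned long get_canary()
{
unsigned long canary;
// read the stack protector's secret from %fs:0x28 into a real return value,
// instead of relying on whatever happens to be left in %rax
asm volatile("mov %%fs:0x28, %0" : "=r" (canary));
return canary;
}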
void func2()
{
printf("I did it!\n");
}
void func(const char *input)
{
#ifdef USE_CANARY
long int canary = 0xdeadbeefc001d00d;
#endif
char buf[256];
// strcpy(buf, input);
// cause there's a null in the address...
memcpy(buf, input, 256+8*8);
// buffer overfl0wwwwwwwwwww
printf("buf: %s\n", buf);
fflush(stdout);
#ifdef USE_CANARY
if (canary != 0xdeadbeefc001d00d) {
printf("Buffer overflow detected! Aborting!\n");
abort();
}
#endif
}
int main(int argc, char **argv)
{
char input[1024];
int loop;
// fill up input with 'A' x 256 (leaving 1024-256 bytes left)
memset(input, 'A', 256);
printf("size of func2's pointer (in bytes): %zu\n", sizeof(&func2));
printf("func2's address: %lx\n", (unsigned long) func2);
if (argc < 2) {
loop = 2; // loop = 2 is the optimal solution for this case.
}
else {
loop = atoi(argv[1]);
}
unsigned long gcc_canary = get_canary(); // get gcc canary secret value.
// After the 'AAAA' put a gcc_canary for us. The space gcc will provide will be 16 bytes, so we just fill
// it up with 2x canary secret values. (assuming loop = 2)
for (int i = 0; i < loop; i++) {
unsigned long *p = (unsigned long *) (input + 256 + i * sizeof(&func2));
*p = (unsigned long) gcc_canary;
}
// Now put a pointer to func2(), at least twice (assuming loop = 2).
for (int i = loop; i < loop * 2; i++) {
unsigned long *p = (unsigned long *) (input + 256 + i * sizeof(&func2));
*p = (unsigned long) func2;
}
printf("input: %s\n", input);
// verify the func2's addresses are actually inside the input buffer.
for (int i = 0; i < loop * 2; i++) {
char *p = input + 256 + i * sizeof(&func2);
unsigned long *q = (unsigned long *) p;
printf("address or canary: %lx\n", *q);
}
// call func with input buffer.
func(input);
}
Output:
$ gcc test.c
$ ./a.out
size of func2's pointer (in bytes): 8
func2's address: 55ad7774c21d
input: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
address or canary: ac9d07067ff30500
address or canary: ac9d07067ff30500
address or canary: 55ad7774c21d
address or canary: 55ad7774c21d
buf: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
I did it!
Segmentation fault (core dumped)
Friday, May 3, 2024
Drudget.ai might never launch
https://drudget.ai might never launch, unless it receives funding.
See: https://www.moomoo.com/community/feed/109834449715205
Just to train the AI (neural network) costs $4.6 million USD per iteration, which means if I train it with an algorithm, and the algorithm is wrong, that's $4.6 million that just disappeared from my bank account (due to electricity usage, cloud charges, etc).
And that's before even letting everyone run inference on it for free, which would cost heaps over a year, something only the big players can afford to do.
I know I could just fine tune an existing LLM -- but sometimes they've been "engineered" to block certain inputs already -- cause they're regarded as "safe LLMs". So I need to train my own LLM from scratch.
So unless funding arrives, I'm afraid https://drudget.ai might never see the light of day.
-
Drudget is my cybersecurity company, which audits code using special tools. It specialises in C/C++ code. https://drudget.com.au
-
How I ended up writing a garbage collector - how passion is the end goal for self-healing code. The story of how I wrote a garbage collector...
Basically it falls down to 3 things: 1) a video game using Irrlicht or Unreal (C++), 2) an FPGA based CPU using verilog/vhdl, 3) something cool with SDR (software defined radio).
Let me know if any of those 3 things interests you, by sending me a message on the form at https://www.drudget.com.au