Are there any good books or tutorials you can recommend which go beyond the very beginner level?
I find reading assembly output from compilers to be a good start; it also helps you develop a mental model of what compilers expect in terms of stack hygiene.
Before Godbolt (https://godbolt.org/) there was `gcc -S`, and now there is [3]
gcc -Wa,-adhln -g
for interleaving source and assembly output. Keep the reference manual close and write lots of little experiments to confirm your findings.
[1] https://en.wikipedia.org/wiki/GotoBLAS
[2] uses macros and intrinsics https://github.com/oneapi-src/oneDNN
[3] https://stackoverflow.com/questions/3867721/is-there-any-c-c...
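For what it's worth, here is a minimal sketch of scripting those little experiments in Python. It only wraps the two gcc invocations mentioned above (`gcc -S` and `gcc -Wa,-adhln -g`). The helper names, the sample C function, and the choice of `-O2` and `-O0` are my own assumptions, not something taken from the linked answer.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical sample function to inspect; swap in whatever you are curious about.
C_SOURCE = """\
unsigned sum_to(unsigned n) {
    unsigned total = 0;
    for (unsigned i = 1; i <= n; i++)
        total += i;
    return total;
}
"""


def asm_command(c_file, opt="-O2"):
    """Build a `gcc -S` command that prints the generated assembly to stdout."""
    return ["gcc", "-S", opt, "-o", "-", str(c_file)]


def listing_command(c_file, opt="-O0"):
    """Build a command that asks the assembler for a source/assembly listing.

    -g keeps the line information and -Wa,-adhln passes the listing flags
    through to the assembler; the listing is printed on stdout.
    """
    return ["gcc", "-c", "-g", opt, "-Wa,-adhln", str(c_file)]


if __name__ == "__main__":
    if shutil.which("gcc") is None:
        print("gcc not found on PATH; nothing to run")
    else:
        with tempfile.TemporaryDirectory() as tmp:
            src = Path(tmp) / "experiment.c"
            src.write_text(C_SOURCE)
            for cmd in (asm_command(src), listing_command(src)):
                # Run inside the temp dir so the stray .o file is cleaned up too.
                result = subprocess.run(cmd, cwd=tmp, capture_output=True, text=True)
                print("$", " ".join(cmd))
                print(result.stdout or result.stderr)
```

Changing `opt` between `-O0` and `-O2` and diffing the output is a cheap way to see what the optimizer actually did to a given loop.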
https://download-mirror.savannah.gnu.org/releases/pgubook/Pr...
I was able to learn a lot about low-level programming. The problems with this book: the examples are in AT&T syntax (I find Intel's syntax easier to read, and it's more commonly used), and they target 32-bit x86 rather than x86-64.
Also, Hacking: The Art of Exploitation (https://nostarch.com/hacking2.htm) has a nice introduction to assembly, from the standpoint of someone doing reverse engineering, debugging with GDB, or shellcoding.
This might be what you want: https://www.agner.org/optimize/#manuals
That's not so helpful, so more seriously, the way that I started was to spend a lot of time reading tutorials and writing sample programs. Back in the day I read virus "magazines" like 40Hex, because they had decent examples of Intel assembly and often useful discussion.
These days I've been revisiting things by writing a couple of simple compilers:
https://github.com/skx/math-compiler/
The first was mostly written because I'd not done anything recently with floating-point, and the second because compiling brainfuck programs to assembly seemed like it would result in fast programs.
Simple projects like those above could be written quite quickly I think, because they only involve writing a very small collection of "primitives" (such as "write string to STDOUT", or "sin(x)"). They're almost template-based programs.
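To make the "template-based" point concrete, here is a rough sketch in Python, not the code from the projects linked above. It maps each brainfuck primitive to a fixed snippet of x86-64 assembly (GNU as, AT&T syntax, Linux syscalls). The register choice (`%r12` as the data pointer), the label names, and the 30000-byte tape are assumptions I picked for illustration.

```python
PROLOGUE = "\n".join([
    "    .lcomm tape, 30000       # zero-initialised tape",
    "    .text",
    "    .globl _start",
    "_start:",
    "    leaq tape(%rip), %r12    # %r12 = data pointer (arbitrary choice)",
])

EPILOGUE = "\n".join([
    "    movq $60, %rax           # exit on x86-64 Linux",
    "    xorq %rdi, %rdi          # status 0",
    "    syscall",
])

WRITE_CELL = "\n".join([
    "    movq $1, %rax            # write",
    "    movq $1, %rdi            # fd 1 = stdout",
    "    movq %r12, %rsi          # buffer = current cell",
    "    movq $1, %rdx            # one byte",
    "    syscall",
])

READ_CELL = "\n".join([
    "    movq $0, %rax            # read",
    "    movq $0, %rdi            # fd 0 = stdin",
    "    movq %r12, %rsi          # buffer = current cell",
    "    movq $1, %rdx            # one byte",
    "    syscall",
])

# One fixed template per simple primitive.
TEMPLATES = {
    ">": "    incq %r12",
    "<": "    decq %r12",
    "+": "    incb (%r12)",
    "-": "    decb (%r12)",
    ".": WRITE_CELL,
    ",": READ_CELL,
}

# Loops are the only templates that need a parameter: a unique label number.
LOOP_START = "loop_start_{n}:\n    cmpb $0, (%r12)\n    je loop_end_{n}"
LOOP_END = "    jmp loop_start_{n}\nloop_end_{n}:"


def compile_bf(source):
    """Translate brainfuck source into assembly text by pasting templates."""
    lines = [PROLOGUE]
    open_loops = []  # stack of label numbers for loops not yet closed
    next_id = 0
    for ch in source:
        if ch in TEMPLATES:
            lines.append(TEMPLATES[ch])
        elif ch == "[":
            open_loops.append(next_id)
            lines.append(LOOP_START.format(n=next_id))
            next_id += 1
        elif ch == "]":
            if not open_loops:
                raise ValueError("unmatched ']'")
            lines.append(LOOP_END.format(n=open_loops.pop()))
        # every other character is a comment in brainfuck and is skipped
    if open_loops:
        raise ValueError("unmatched '['")
    lines.append(EPILOGUE)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(compile_bf("+[-]"))
```

There is no optimization at all here (runs of `+` are not folded, for example), which is part of why this kind of compiler is quick to write. On an x86-64 Linux box the output should be something you can feed to `as` and `ld`, but treat that as untested.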
https://cs.lmu.edu/~ray/notes/nasmtutorial/
x86-64 (aka x64) is less insane than 32-bit x86, so once you get your feet under you it is easier to understand.
https://software.intel.com/sites/landingpage/IntrinsicsGuide...
I don't have an AMD CPU, so I didn't need to look up AMD-specific stuff. It would be amazing if AMD had something like Intel's interactive Intrinsics Guide.
https://www.amazon.com/Hackers-Delight-2nd-Henry-Warren/dp/0...
[1]: https://www.cs.yale.edu/flint/cs422/doc/art-of-asm/pdf/
http://www.s100computers.com/Software%20Folder/6502%20Monito...
> HLA was originally conceived as a tool to teach assembly language programming at the college-university level. The goal is to leverage students' existing programming knowledge when learning assembly language to get them up to speed as fast as possible. Most students taking an assembly language programming course have already been introduced to high-level control flow structures, such as IF, WHILE, FOR, etc. HLA allows students to immediately apply that programming knowledge to assembly language coding early in their course, allowing them to master other prerequisite subjects in assembly before learning how to code low-level forms of these control structures. The book The Art of Assembly Language Programming by Randall Hyde uses HLA for this purpose
Web: https://plantation-productions.com/Webster/
Book: "The Art of Assembly Language Programming" https://plantation-productions.com/Webster/www.artofasm.com/
Portable, Opensource, IA-32, Standard Library: https://sourceforge.net/projects/hla-stdlib/
"12.4 Programming in C/C++ and HLA" in the Linux 32 bit edition: https://plantation-productions.com/Webster/www.artofasm.com/...
... A chapter or chapters about wider registers, WASM, LLVM bitcode, etc. might be useful?
... Many awesome lists link to OllyDbg and other great resources for ASM, such as Ghidra: https://www.google.com/search?q=ollydbg+site%3Agithub.com+in...
[2] https://www.amazon.com/Practical-Malware-Analysis-Hands-Diss...
It focuses on offensive security techniques and concepts, but it taught me a ton and covers both x86 and x86-64. It briefly touches on ARM as well.