Assembly HOWTO
François-René Rideau rideau@ens.fr
v0.4c, 9 February 1997

This is the Linux x86 Assembly HOWTO, aka Free 32-bit x86 Assembly FAQ. This document describes how to program in x86 assembly using only FREE programming tools, particularly for the Linux/i386 OS. Included material may or may not be applicable to other hardware and/or software platforms. This used to be the Assembly mini-HOWTO.

keywords: assembly, assembler, free, macroprocessor, preprocessor, asm, inline asm, 32-bit, x86, i386, gas, as86, nasm

1. INTRODUCTION

1.1. Legal Blurb

Copyright (C) 1996, 1997 Francois-Rene Rideau. You can freely distribute this document, provided the original document is pointed to and any modification is clearly indicated as such.

1.2. IMPORTANT NOTE

This is still a VERY PRELIMINARY version of this document.

You (hey, that's you I'm talking to, so please listen!) are especially invited to ask questions, to answer questions, to correct given answers, to add new FAQ answers, to give pointers to other software, to insult the current maintainer (me), and TO TAKE OVER THE MAINTENANCE OF THE FAQ in his place (mine), because I have other things to do... For any of these, please contact me at rideau@ens.fr

Perhaps we can convince Raymond Moon to add a section to his FAQ for comp.lang.asm.x86... ?

1.3. Foreword

This document aims at answering frequently asked questions of people who program, or want to program, 32-bit x86 assembly using free assemblers, particularly under the Linux operating system. It may also point to other documents about non-free, non-x86, or non-32-bit assemblers, though that is not its primary goal.

Because the main interest of assembly programming is to write the guts of operating systems, languages, and games, where a C compiler fails to provide the needed expressivity (performance is less and less often the issue), we stress the development of such software.

1.3.1.
How to use this document

This document contains answers to some frequently asked questions. In many places, Uniform Resource Locators (URLs) are given for some software or documentation repository. Please note that the most useful repositories are mirrored, and that by accessing a nearer mirror site you relieve the whole Internet of unneeded network traffic, while saving your own precious time. In particular, there are large repositories all over the world that mirror other popular repositories. You should find out which of those are near you (networkwise) and take note of them. Sometimes the list of mirrors is given in a file, or in a login message. Please heed the advice. Otherwise, ask archie about the software you're looking for...

The most recent version of this document sits at

http://www.eleves.ens.fr:8080/home/rideau/Assembly-HOWTO or
http://www.eleves.ens.fr:8080/home/rideau/Assembly-HOWTO.sgml

but what's in the Linux HOWTO repositories should be fairly up to date, too (I can't know):

ftp://sunsite.unc.edu/pub/linux/docs/HOWTO/ (?)

A French translation of this HOWTO, or of an earlier version of it, might sit around

ftp://ftp.ibp.fr/pub/linux/french/HOWTO/

1.3.2. Other related documents

· If you don't know what free software is, please do read carefully the GNU General Public License, which is used in a lot of free software, and is a model for most; it generally comes in a file named COPYING, with a library version in a file named COPYING.LIB. Literature from the FSF (Free Software Foundation) might help you, too.

· In particular, the interesting kind of free software comes with sources that you can consult and correct, or sometimes even borrow from. Read your particular license carefully, and do comply with it.

· There is a FAQ for comp.lang.asm.x86 that answers generic questions about x86 assembly programming, and questions about some commercial assemblers in a 16-bit DOS environment. Some of it applies to free 32-bit asm programming, so you may want to read this FAQ:
http://www2.dgsys.com/~raymoon/faq/asmfaq.zip

· FAQs and docs exist about programming on your favorite platform, whichever it is; consult them for platform-specific issues not directly related to programming in assembler.

1.4. History

Each version includes a few fixes and minor corrections, which need not be mentioned each time.

Version 0.1      ? ? 1996
   I (Faré) create and publish the first mini-HOWTO, 'cause I'm sick of answering the same questions over and over on comp.lang.asm.x86

Version 0.2      ? ? 1996
   ?

Version 0.3      ? ? 1996
   ?

Version 0.3c     15 Jun 1996
   ?

Version 0.3f     17 Oct 1996
   * found -fasm option to enable GCC inline assembler w/o -O optimizations

Version 0.3g     2 Nov 1996
   * created the History
   * added pointers in cross-compiling section
   * added section about I/O programming under Linux (particularly video)

Version 0.3h     6 Nov 1996
   * more about cross-compiling -- See on sunsite: devel/msdos/

Version 0.3i     16 Nov 1996
   * NASM is getting pretty slick

Version 0.3j     24 Nov 1996
   * point to french version

Version 0.3k     19 Dec 1996
   * What? I had forgotten to point to terse???

Version 0.3l     11 Jan 1997
   ?

Version 0.4pre1  13 Jan 1997
   text mini-HOWTO transformed into a full sgml HOWTO, to see what the SGML tools are like.

Version 0.4      20 Jan 1997
   first release of the HOWTO as such.

Version 0.4a     20 Jan 1997
   * CREDITS section added

Version 0.4b     3 Feb 1997
   * NASM put before AS86

Version 0.4c     9 Feb 1997
   * Added section "DO YOU NEED ASSEMBLY?"

1.5. Credits

I would like to thank the following persons, by order of appearance:

· Linus Torvalds for Linux

· Bruce Evans for bcc from which as86 is extracted

· Janes Faber for his WWW page

· Simon Tatham, Julian Hall and the other NASM hackers (for NASM, what else?)

· Jim Neil for Terse

· Greg Hankins for maintaining HOWTOs

· Raymond Moon for his FAQ

· Michael Taeschner for pointing me to EMX

· KiSung Um for his moral support

· Eric Dumas for his translation of the mini-HOWTO into French...

· People I've forgotten to mention -- please remind me!

2.
DO YOU NEED ASSEMBLY?

Well, I wouldn't want to interfere with what you're doing, but here are a few pieces of advice from hard-earned experience.

2.1. Pros and Cons

2.1.1. The advantages of Assembly

Assembly can express very low-level things:

· you can access machine-dependent registers and I/O.

· you can control the exact behavior of code in critical sections that might involve hardware or I/O lock-ups.

· you can break the conventions of your usual compiler, which might allow some optimizations (like temporarily breaking rules about GC, threading, etc).

· you can get access to unusual programming modes of your processor (e.g. 16-bit code for startup or BIOS interface on Intel PCs).

· you can build interfaces between code fragments using incompatible conventions (e.g. produced by different compilers, or separated by a low-level interface).

· you can produce reasonably fast code for tight loops to cope with a bad non-optimizing compiler (but then, there are free optimizing compilers available!)

· you can produce hand-optimized code that's perfectly tuned for your particular hardware setup, though not for anyone else's.

· you can write some code for your new language's optimizing compiler (that's something few will ever do, and even they, not often).

2.1.2. The disadvantages of Assembly

Assembly is a very low-level language (the lowest above hand-coding the binary instruction patterns). This means

· it's long and tedious to write initially,

· it's very bug-prone,

· your bugs will be very difficult to chase,

· it's very difficult to understand and modify, i.e. to maintain,
· the result is very non-portable to other architectures, existing or future,

· your code will be optimized only for a certain implementation of the same architecture (optimizing for the 386, 486, 5x86, K5, Pentium, 6x86, PPro, and whatever, calls for completely different techniques),

· your code might also be unportable across different OS platforms on the same architecture, for lack of proper tools (well, NASM seems to work or be workable on all Intel platforms),

· you spend more time on a few details, and can't focus on algorithmic design, small and large, which is known to bring the largest part of the speed-up. E.g. you might build very fast list manipulation primitives in assembly, when a hash table would have sped up your program much more; or, in another context, a binary tree; or some structure distributed over a cluster of CPUs,

· a small change in algorithmic design might completely invalidate all your existing assembly code, so that either you're ready (and able) to rewrite it all, or you're tied to a particular algorithmic design.

2.1.3. Assessment

All in all, you might find that though using assembly is sometimes needed, and might even be useful in a few cases where it is not, you'll want to:

· minimize the use of assembly code,

· encapsulate this code in well-defined interfaces,

· have your assembly code automatically generated from patterns expressed in a higher-level language than assembly (from ``macros'' to a high-level language),

· have automatic tools translate these programs into assembly code,

· have this code be optimized if possible.

Even in cases when Assembly is needed (e.g. OS development), you'll find that not so much of it is, and that the above principles hold. See the sources for Linux (the OS) about it: as little assembly as needed, resulting in a fast, reliable, portable, maintainable OS. Even a successful game like DOOM was almost entirely written in C, with only a tiny part written in assembly for speed.

2.2.
How to NOT use Assembly

2.2.1. Languages with optimizing compilers

Languages like ObjectiveCAML, SML, CommonLISP, Scheme, ADA, Pascal, C, C++, for instance, all have free optimizing compilers that'll optimize the bulk of your programs, and often do better than hand-coded assembly even for tight loops, while allowing you to focus on higher-level details, and without preventing you from grabbing a few percent of extra performance once you've reached a stable design. Of course, there are commercial optimizing compilers for most of these languages, too!

Some languages have compilers that produce C code, which can be further optimized by a C compiler. LISP, Scheme, Perl, and many others are among them. Speed is fairly good.

2.2.2. General procedure to speed your code up

As for speeding code up, you should do it only for parts of a program that a profiling tool has consistently identified as being a performance bottleneck.

Hence, if you identify some code portion as being too slow, you should

· first try to use a better algorithm;

· then try to compile it instead of interpreting it;

· then try to enable optimization from your compiler;

· then give the compiler hints about how to optimize (typing information in LISP; register usage with GCC; lots of options in most compilers, etc);

· then possibly fall back to assembly programming.

Finally, before you end up writing assembly, you should inspect the generated code, to check that the problem really is bad code generation, as this might well not be the case: compiler-generated code might be better than what you'd have written, particularly on modern pipelined architectures! Slow parts of a program might be intrinsically so. Perhaps a completely different approach to the problem might help, then.

2.2.3. Inspecting compiler-generated code

There are many reasons to inspect compiler-generated assembly code.
Here is what you'll do with such code:

· check whether generated code can be obviously enhanced with hand-coded assembly

· when that's the case, start from generated code and modify it instead of starting from scratch

· more generally, use generated code as stubs to modify, which at least gets right the way your assembly routines interface to the external world

· track down bugs in your compiler (hopefully rarer)

The standard way to have assembly code be generated is to invoke your compiler with the -S flag. This works with most Unix compilers, including the GNU C Compiler (GCC), but YMMV. As for GCC, it will produce more understandable assembly code with the -fverbose-asm command-line option. Of course, if you want to get good assembly code, don't forget your usual optimization options and hints! For instance, gcc -O2 -S -fverbose-asm foo.c leaves the commented assembly in foo.s.

3. ASSEMBLERS

3.1. GCC Inline Assembly

The well-known GNU C/C++ Compiler (GCC), an optimizing 32-bit compiler at the heart of the GNU project, supports the x86 architecture quite well, and includes the ability to insert assembly code in C programs, in such a way that register allocation can be either specified or left to GCC. GCC works on most available platforms, notably Linux, *BSD, VSTa, OS/2, *DOS, Win*, etc.

3.2. Where to find GCC

The original GCC site is ftp://prep.ai.mit.edu/pub/gnu/ together with all the released application software from the GNU project. There are a lot of mirrors, however. Sources adapted to your favorite OS, and binaries precompiled for it, should also be found at your usual FTP sites.

For GCC under Linux, see around http://www.linux.org.uk/

The most popular DOS port of GCC is named DJGPP, and can be found in directories of that name on FTP sites. See:

http://www.delorie.com/djgpp/

There is also a port of GCC to OS/2 named EMX, that also works under DOS, and includes lots of unix-emulation library routines.
See around:

http://www.leo.org/pub/comp/os/os2/gnu/emx+gcc/
http://warp.eecs.berkeley.edu/os2/software/shareware/emx.html
ftp://ftp-os2.cdrom.com/pub/os2/emx09c/

3.2.1. Where to find docs for GCC Inline Asm

The documentation of GCC includes documentation files in texinfo format, that you can convert to tex, compile (with tex), and print; convert to interactive emacs .info format and browse; convert (with the right tools) to whatever you like; or just read as is. The .info files are generally found on any good installation of GCC.

The right section to look for is: C Extensions::Extended Asm::

Section Invoking GCC::Submodel Options::i386 Options:: might help too. In particular, it gives the i386 specific constraint names for registers: abcdSDB correspond to %eax, %ebx, %ecx, %edx, %esi, %edi, %ebp respectively (no letter for %esp).

A URL for this document and section, as converted to HTML format, is http://www.cygnus.com/doc/usegcc_89.html#SEC92

The DJGPP Games resource (not only for game hackers) has this page specifically about assembly:

http://www.rt66.com/~brennan/djgpp/djgpp_asm.html

Finally, there is a web page called ``DJGPP Quick ASM Programming Guide'', that covers URLs to FAQs, AT&T x86 ASM syntax, some inline ASM information, and converting .obj/.lib files:

http://remus.rutgers.edu/~avly/djasm.html

GCC depends on GAS for assembling, and follows its syntax (see below); do mind that inline asm needs percent characters to be quoted (doubled, as in %%eax, when the asm statement has operands) so that they are passed through to GAS. See the section about GAS below.

Find lots of useful examples in the linux/include/asm-i386/ subdirectory of the sources for the free Linux OS.

3.2.2. Invoking GCC to have it properly inline assembly code ?

Be sure to invoke GCC with the -O flag (or -O2, -O3, etc), to enable optimizations and inline assembly. If you don't, your code may compile, but not run properly!!!
Actually (kudos to Tim Potter, timbo@moshpit.air.net.au), it is enough to use the -fasm flag, which is part of all the features enabled by $@
______________________________________________________________________

4.2.1. CPP

CPP is truly not very expressive, but it's enough for easy things, it's standard, and it is called transparently by GCC.

As an example of its limitations, you can't declare objects so that destructors are automatically called at the end of the declaring block, you can't co-declare data and the code to process it, etc.

CPP came with your C compiler. If you could manage without one, don't bother fetching any (though I wonder how you could). GCC (see above) is a free C compiler you could have fetched.

4.2.2. M4

M4 gives you the full power of macroprocessing, with a Turing equivalent language, recursion, regular expressions, etc. You can do with it everything that CPP cannot.

See macro4th/This4th from ftp://ftp.forth.org/pub/Forth/ in Reviewed/ ANS/ (?), or the Tunes 0.0.0.25 sources as examples of advanced macroprogramming using m4.

However, its fucked up quoting semantics force you to use explicit continuation-passing tail-recursive macro style if you want to do advanced macro programming (which is reminiscent of TeX -- BTW, has anyone tried to use TeX as a macroprocessor for anything other than typesetting ?). This is NOT worse than CPP, which allows neither quoting nor recursion anyway.

The right version of m4 to get is GNU m4 1.4 (or later if one exists), which has the most features and the fewest bugs or limitations of all.

4.2.3. Macroprocessing with yer own filter

You can write your own simple macro-expansion filter with the usual tools: perl, awk, sed, etc. That's quick to do, and you control everything. But of course, any power in macroprocessing must be earned the hard way.

4.2.4. Metaprogramming

Instead of using an external filter that expands macros, one way to do things is to write programs that write part or all of other programs.
For instance, you could use a program outputting source code

· to generate sine/cosine/whatever lookup tables,

· to extract a source-form representation of a binary file,

· to compile your bitmaps into fast display routines,

· to extract documentation, initialization/finalization code, description tables, as well as normal code from the same source files,

· to have customized assembly code, generated from a perl/shell/scheme script that does arbitrary processing (particularly useful when some kind of data must be mirrored into many cross-referencing tables and code chunks),

· etc.

Think about it !

4.2.4.1. Backends from existing compilers

Compilers like SML/NJ, Objective CAML, MIT-Scheme, etc, do have their own generic assembler backend, which you might or might not want to use, if you intend to generate code semi-automatically from the corresponding languages.

4.2.4.2. The New-Jersey Machine-Code Toolkit

There is a project, using the programming language Icon, to build a basis for producing assembly-manipulating code. See around http://www.cs.virginia.edu/~nr/toolkit/

4.2.4.3. Tunes

The Tunes OS project is developing its own assembler as an extension to the Scheme language, as part of its development process. It doesn't run at all yet, though help is welcome.

The assembler manipulates symbolic syntax trees, so it could equally serve as the basis for an assembly syntax translator, a disassembler, a common assembler/compiler back-end, etc. Also, the full power of a real language, Scheme, makes it unchallenged as for macroprocessing/metaprogramming.

http://www.eleves.ens.fr:8080/home/rideau/Tunes/

5. CALLING CONVENTIONS

5.1. Linux

5.1.1. Linking to GCC

That's the preferred way.

32-bit arguments are pushed down the stack in reverse order (hence accessed/popped in the right order) above the 32-bit near return address. %ebp, %esi, %edi, %ebx are preserved by the callee; %eax holds the result, or %edx:%eax for 64-bit results.

FP stack: a floating-point result is returned in st(0). The other x87 registers are caller-saved rather than callee-saved, and the register stack is expected to be empty at the call and, apart from st(0), at the return.
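To make the convention above concrete, here is a minimal sketch of a hand-written routine that a C caller can link to. It is an illustration under stated assumptions, not anything taken from GCC or the Linux sources: the names sum3 and check_sum3 are made up for the example, the assembly part assumes Linux/i386 ELF (no leading underscore on symbols) and GCC's default calling convention, and on any other target the file falls back to plain C just so that it still builds.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example routine: returns a + b + c. */
int32_t sum3(int32_t a, int32_t b, int32_t c);

#if defined(__i386__) && defined(__ELF__)
/* Hand-written body in AT&T (GAS) syntax.  This is a top-level asm
   with no operands, so percent signs are NOT doubled here.
   At entry 0(%esp) holds the return address and the arguments sit
   above it; once the %ebp frame is set up they are found at
   8(%ebp), 12(%ebp) and 16(%ebp). */
__asm__(
    ".pushsection .text\n"
    ".globl sum3\n"
    ".type sum3, @function\n"
    "sum3:\n"
    "    pushl %ebp\n"              /* %ebp is callee-saved: save it  */
    "    movl  %esp, %ebp\n"
    "    movl  8(%ebp), %eax\n"     /* eax = a (pushed last)          */
    "    addl  12(%ebp), %eax\n"    /* eax += b                       */
    "    addl  16(%ebp), %eax\n"    /* eax += c; result stays in eax  */
    "    popl  %ebp\n"              /* restore the caller's %ebp      */
    "    ret\n"                     /* the caller removes the args    */
    ".size sum3, .-sum3\n"
    ".popsection\n"
);
#else
/* Not i386 ELF: plain C stand-in so the sketch still compiles. */
int32_t sum3(int32_t a, int32_t b, int32_t c)
{
    return a + b + c;
}
#endif

/* The C side: GCC pushes c, then b, then a, and reads %eax back. */
void check_sum3(void)
{
    assert(sum3(1, 2, 3) == 6);
    assert(sum3(-5, 2, 3) == 0);
}
```

The routine only touches %eax and %ebp, so there is nothing else to save; a routine that used %ebx, %esi or %edi would have to push and pop them the same way. Building with options that change the calling convention would break the sketch.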
Note that GCC has options to modify the calling conventions by reserving registers, having arguments in registers, not assuming the FPU, etc. Check the i386 info pages.

Beware that you must then declare the cdecl attribute for a function that will follow standard GCC calling conventions (I don't know what it does with modified calling conventions). See in the GCC info pages the section: C Extensions::Extended Asm::

5.1.2. ELF vs a.out problems

Some C compilers prepend an underscore to every symbol, while others do not.

In particular, Linux a.out GCC does such prepending, while Linux ELF GCC does not.

If you need to cope with both behaviors at once, see how existing packages do. For instance, get an old Linux source tree, the Elk, qthreads, or OCAML...

You can also override the implicit C->asm renaming by inserting statements like
______________________________________________________________________
        void foo(void) asm("bar");
______________________________________________________________________

to be sure that the C function foo will really be called bar in assembly.

Note that the utility objcopy, from the binutils package, should allow you to transform your a.out objects into ELF objects, and perhaps the contrary too, in some cases. More generally, it will do lots of file format conversions.

5.1.3. Direct Linux syscalls

This is specifically NOT recommended, because it may change, it's not portable, it's a burden to write, it's redundant with the libc effort, AND it precludes fixes and extensions that are made to the libc, like, for instance, the zlibc package, that does on-the-fly transparent decompression of gzip-compressed files. The standard, recommended way to call Linux system services is, and will stay, to go through the libc.

Shared objects should keep your stuff small. And if you really want smaller binaries, do use #! stuff, with the interpreter having all the overhead you want to keep out of your binaries.
Now, if for some reason you don't want to link to the libc, go get the libc and understand how it works! After all, you're claiming to replace it, ain't you?

You might see how linux-eforth-1.0c.tgz does it:

ftp://ftp.forth.org/pub/Forth/Linux/

The sources for Linux come in handy, too, particularly the asm/unistd.h header file, that describes how to do system calls...

Basically, you issue an int $0x80, with the __NR_syscallname number (from asm/unistd.h) in %eax, and parameters (up to five) in %ebx, %ecx, %edx, %esi, %edi respectively. The result is returned in %eax, with a negative result being an error whose negation is what libc would put in errno. The user stack is not touched, so you needn't have a valid one when doing a syscall.

5.1.4. I/O under Linux

If you want to do direct I/O under Linux, either it's something very simple that needs no OS arbitration, and you should see the IO-Port-Programming mini-HOWTO; or it needs a kernel device driver, and you should try to learn more about kernel hacking, device driver development, kernel modules, etc, for which there are other excellent HOWTOs and documents from the LDP.

In particular, if what you want is graphics programming, then do join the GGI project:

http://synergy.caltech.edu/~ggi/
http://sunserver1.rz.uni-duesseldorf.de/~becka/doc/scrdrv.html

Anyway, in all these cases, you'll be better off using GCC inline assembly with the macros from linux/asm/*.h than writing full assembly source files.

5.2. DOS

Most DOS extenders come with some interface to DOS services. Read their docs about that, but often, they just simulate int $0x21 and such, so you do ``as if'' you were in real mode (I doubt they have stubs to make things work with 32-bit operands by calling 16-bit DOS services as needed).

Docs about DPMI and such can be found on ftp://x2ftp.oulu.fi/pub/msdos/programming/

DJGPP comes with its own (limited) libc replacement, too.
It is possible to cross-compile from Linux to DOS; see the devel/msdos/ directory of your local FTP mirror for sunsite.unc.edu

5.3. Winblows and suches

Hey, this document covers only free software. Ring me when Winblows becomes free, or when there are free dev tools for it!

5.4. Yer very own OS

That's what many asm programmers talk about

5.4.1. Boot loader code & getting into 32-bit mode

5.4.2. The basics about protection

5.4.3. Handling Interrupts

5.4.4. V86/R86 mode for using 16-bit system services.

5.4.5. Defining your object format and calling conventions

5.4.6. Where to find info about it all.

Please add pointers to other documents to this section

The main source of information is the sources of existing OSes. Lots of pointers lie in the following WWW page: http://www.eleves.ens.fr:8080/home/rideau/Tunes/Review/OSes.html

In particular, check Cygnus support's ftp.cygnus.com archive (or a mirror like ftp://sunsite.doc.ic.ac.uk/packages/gnu/cygnus/), or the Flux project (http://ww.cs.utah.edu/projects/flux/), etc.

6. TODO & POINTERS

· fill incomplete sections

· add more pointers to software...

· add simple examples from real life to illustrate the syntax, power, and limitations of each proposed solution.

· ask people to help with this HOWTO

· find someone who has got some time to take over the maintenance

· perhaps give a few words for assembly on other platforms?

· A few pointers

   · PM FAQ

   · http://www.fys.ruu.nl/~faber/Amain.html

   · http://alaska.net/~rrose/assembly.htm

   · http://www.cera.com

   · http://www.cit.ac.nz/smac/csware.htm

   · game programming

· And of course, do use your usual Internet Search Tools to look for more information, and tell me anything interesting you find!

Authors' .sig:

-- , , _ v ~ ^ --
-- Fare -- rideau@clipper.ens.fr -- Francois-Rene Rideau -- +)ang-Vu Ban --
-- ' / . --
Join the TUNES project for a computing system based on computing freedom !
TUNES is a Useful, Not Expedient System
WWW page at URL: http://www.eleves.ens.fr:8080/home/rideau/Tunes/