using nasm and segment overrides
Rune Kristiansen
2003-11-13 22:55:34 UTC
Hi,
I am trying to learn assembly. I am now reading up on segments and how
they work with storing data, jumping to code, etc. I have had
problems with reading data from floppy and then reading the memory
buffer by using DS. It didn't seem to point to the right place.
Then I noticed a thing called a segment override, where I can write
something like:

mov si, ds:[0000h]

What I at least think this is doing is putting my offset address into
SI, so that by reading through SI, as in mov ax, [si], I will get the value
located at ds:si (or 2000h:0000h, or similar). I am really not sure,
but actually that's not the point either.

The point is that nasmw doesn't support segment overrides. Am I correct?
Would using masm be a better choice?

I am using nasmw (0.98.38 win).

Thank you,
Rune
Matt Taylor
2003-11-14 01:51:33 UTC
Post by Rune Kristiansen
Hi,
I am trying to learn assembly. I am now reading up on segments and how
they work with storing data, jumping to code, etc. I have had
problems with reading data from floppy and then reading the memory
buffer by using DS. It didn't seem to point to the right place.
Then I noticed a thing called a segment override, where I can write
mov si, ds:[0000h]
What I at least think this is doing is putting my offset address into
SI, so that by reading through SI, as in mov ax, [si], I will get the value
located at ds:si (or 2000h:0000h, or similar). I am really not sure,
but actually that's not the point either.
If ds=2000h, then it will load the 16-bit value at 2000:0000 into si. Memory
references usually default to the ds segment, so in most cases the ds:
override is redundant. References through sp and bp are the only exceptions
that I can remember, and they implicitly use ss. When ds=ss, you won't have
to worry about it.
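
The address arithmetic behind that can be modelled in a few lines. This is a rough sketch in Python rather than assembly, so that it runs anywhere; `linear` and `read_word` are made-up helper names, and the byte values poked into the pretend memory are arbitrary:

```python
# Sketch of 8086 real-mode addressing. `linear` and `read_word` are
# made-up helper names for illustration only.

def linear(segment: int, offset: int) -> int:
    """Physical address of segment:offset, wrapped to 20 bits as on an 8086."""
    return ((segment << 4) + (offset & 0xFFFF)) & 0xFFFFF


def read_word(memory: bytearray, segment: int, offset: int) -> int:
    """Little-endian 16-bit load, like `mov si, [ds:0000h]` with ds=segment."""
    low = memory[linear(segment, offset)]
    high = memory[linear(segment, offset + 1)]  # offset wraps inside the segment
    return low | (high << 8)


memory = bytearray(1 << 20)   # 1 MiB of pretend RAM, all zeroes
memory[0x20000] = 0x34        # low byte is stored first
memory[0x20001] = 0x12

ds = 0x2000
si = read_word(memory, ds, 0x0000)
print(hex(linear(ds, 0x0000)), hex(si))
```

With ds=2000h the load comes from physical address 20000h, and si ends up holding the 16-bit value stored there, not the address itself.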
Post by Rune Kristiansen
The point is that nasmw doesn't support segment overrides. Am I correct?
Would using masm be a better choice?
I am using nasmw (0.98.38 win).
Nasm does support segment overrides. It's picky about the syntax, however.
You have to put the segment inside the brackets like so:

mov al, ds:[si] ; wrong!
mov al, [ds:si] ; correct

I would contend that nasm is a better choice than masm for writing a boot
loader. Nasm is WYSIWYG, whereas masm does many things behind your back.
That's my opinion; others' may vary. Choose whichever assembler you like
best.

-Matt
Rune Kristiansen
2003-11-14 08:42:50 UTC
Post by Matt Taylor
Nasm does support segment overrides. It's picky about the syntax, however.
mov al, ds:[si] ; wrong!
mov al, [ds:si] ; correct
I would contend that nasm is a better choice than masm for writing a boot
loader. Nasm is WYSIWYG, whereas masm does many things behind your back.
That's my opinion; others' may vary. Choose whichever assembler you like
best.
-Matt
Thanks! I really like nasm for the reasons you point out: masm does
too much stuff behind the scenes, and since I haven't been able to assemble
raw binaries with it to date, I just went on using nasm instead.

I tried the mov al, [ds:si] approach and it works; it assembles, at
least. I am still not getting the results I want. I think I am
confused about ds and cs. I thought cs was the code segment and ds the data
segment. But when I load my "os" (in my bootsector), I put it at
location 1000:0000 and then set ds to point to that location before I
jmp to 1000:0000 to execute the code. I thought you had to use cs
here, but apparently not. Later on, when my "os" loads image data
from floppy to location 2000:0000, I want to point my si to that
location to read the data there while still running the code at
1000:0000. Do you understand my problem? I started reading up on
memory assignment and locations in Randall Hyde's Art of Assembly, but
it only seems to confirm what I know.

I need to read more I think...

Thanks tho :)
Matt Taylor
2003-11-14 14:41:16 UTC
Post by Rune Kristiansen
I tried the mov al, [ds:si] approach and it works; it assembles, at
least. I am still not getting the results I want. I think I am
confused about ds and cs. I thought cs was the code segment and ds the data
segment. But when I load my "os" (in my bootsector), I put it at
location 1000:0000 and then set ds to point to that location before I
jmp to 1000:0000 to execute the code. I thought you had to use cs
here, but apparently not. Later on, when my "os" loads image data
from floppy to location 2000:0000, I want to point my si to that
location to read the data there while still running the code at
1000:0000. Do you understand my problem? I started reading up on
memory assignment and locations in Randall Hyde's Art of Assembly, but
it only seems to confirm what I know.
You were right, ds = data segment and cs = code segment. That is what
they're intended to be used for; however, what they actually point to is up
to you.
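
To make the 1000:0000 versus 2000:0000 situation concrete, here is a rough Python model (not assembly) of why si alone cannot reach the image while ds still holds 1000h. It assumes ds=1000h as described above; `offset_from` is a made-up helper, and address wraparound at 1 MiB is ignored for simplicity:

```python
# Which 16-bit offset, if any, reaches a given physical address from a segment?
# `linear` and `offset_from` are illustrative names, not part of any tool.

def linear(segment: int, offset: int) -> int:
    """Physical address of segment:offset in real mode."""
    return ((segment << 4) + offset) & 0xFFFFF


def offset_from(segment: int, target: int):
    """Offset that reaches `target` from `segment`, or None if it lies
    outside that segment's 64 KiB window."""
    delta = target - (segment << 4)
    return delta if 0 <= delta <= 0xFFFF else None


code_seg = 0x1000                   # where the "os" runs, and what ds holds
data_seg = 0x2000                   # where the image was loaded
image = linear(data_seg, 0x0000)    # physical address 20000h

print(offset_from(code_seg, image))  # no 16-bit offset gets there from 1000h
print(offset_from(data_seg, image))  # offset 0 works once the segment is 2000h
```

The window that starts at 1000:0000 ends at 1000:FFFFh, one byte short of 2000:0000, so the segment register (ds, or es with an override) has to change; no value of si will do it.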

What may be happening is, as Jeff pointed out, you may not be initializing
the es segment for the int 13h call. If you post code, perhaps someone can
spot the error.

If you're having tons of problems with the boot loader, I would recommend
using DOS to bootstrap your OS. When I wrote my OS, I never actually wrote a
boot loader because I was not interested in it. I booted DOS and ran a .com
file which simply took over the system. I always intended to write a boot
loader for faster boot times, but never got around to it.

-Matt
Phil Carmody
2003-11-14 16:10:12 UTC
Post by Matt Taylor
Nasm does support segment overrides. It's picky about the syntax, however.
mov al, ds:[si] ; wrong!
mov al, [ds:si] ; correct
Was this syntax TASM's "Ideal" syntax?

Phil
pH
2003-11-14 04:48:13 UTC
Post by Rune Kristiansen
Hi,
I am trying to learn assembly. I am now reading up on segments and how
they work with storing data, jumping to code, etc. I have had
problems with reading data from floppy and then reading the memory
buffer by using DS. It didn't seem to point to the right place.
When reading a disk using Int 13h, function 2, ES:BX is expected to point
to the memory area where you want the data to go. So if DS and ES contain
different values, you could, but not necessarily, have a problem (depending
on what the offset register contained).
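
A rough way to picture that mismatch, sketched in Python rather than assembly: `fake_int13_read` below is a made-up stand-in for the BIOS call and only copies bytes to ES:BX, and the segment values and the ABh fill byte are arbitrary assumptions:

```python
# Model of "the BIOS wrote to ES:BX, but I read back through DS".
# `fake_int13_read` is a hypothetical stand-in, not a real BIOS interface.

def linear(segment: int, offset: int) -> int:
    """Physical address of segment:offset in real mode."""
    return ((segment << 4) + offset) & 0xFFFFF


def fake_int13_read(memory: bytearray, es: int, bx: int, sector: bytes) -> None:
    """Copy one sector to ES:BX, as int 13h function 2 is expected to do."""
    start = linear(es, bx)
    memory[start:start + len(sector)] = sector


memory = bytearray(1 << 20)          # 1 MiB of pretend RAM, all zeroes
sector = bytes([0xAB]) * 512         # pretend sector contents

es, bx = 0x2000, 0x0000
fake_int13_read(memory, es, bx, sector)

ds = 0x1000                          # DS left pointing somewhere else
via_ds = memory[linear(ds, bx)]      # reads 1000:0000, not the buffer
via_es = memory[linear(es, bx)]      # reads 2000:0000, the buffer
print(hex(via_ds), hex(via_es))
```

Reading with the same offset through the wrong segment register returns whatever happens to be at that other address, which is why the buffer seems to "not point correctly".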
Post by Rune Kristiansen
Then I noticed a thing called a segment override, where I can write
mov si, ds:[0000h]
With "general" data transfers, the DS register is implied, and doesn't
need to be specified. Here, a better example of an override would've
been:

mov si, es:[0000] , or,

mov si, ss:[0000], or,

mov si, cs:[0000], or...

Also, remember that when using "indirect" addressing, via register, that
certain registers imply that a specific segment register will be used...
unless you use a segment override. For example:

mov ax,[bx], means that the 16 bit value at DS:[BX] will be copied
into the AX register. Whereas:

mov ax,[bp+02], will get the value at SS:[BP+02]

stosb will copy the value in AL to ES:DI

movsb copies the 8 bit value at DS:SI to ES:DI.

With the "string" instructions, you can, if need be, override the
source segment, such as:

lods byte ptr cs:[si], but not so with the destination.
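
Those defaults can be summarised as a small lookup. This is only a sketch in Python of the 16-bit rules described above; the function and table names are invented for illustration:

```python
# Default segment selection for 16-bit addressing, as a lookup sketch.
# All names here are illustrative, not part of any assembler.

def default_segment(registers) -> str:
    """Default segment for a memory operand built from these registers.
    Anything involving bp defaults to ss; everything else, including a
    bare displacement, defaults to ds."""
    return "ss" if "bp" in registers else "ds"


def effective_segment(registers, override=None) -> str:
    """Segment actually used once an optional override is applied."""
    return override or default_segment(registers)


# String instructions: (source, destination). None means that side is a
# register rather than memory. Only the source segment can be overridden.
STRING_SEGMENTS = {
    "lodsb": ("ds:si", None),
    "stosb": (None, "es:di"),
    "movsb": ("ds:si", "es:di"),
}

print(default_segment(["bx"]))               # mov ax, [bx]
print(default_segment(["bp"]))               # mov ax, [bp+02]
print(effective_segment(["si"], "cs"))       # lods with a cs override
print(STRING_SEGMENTS["movsb"])
```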
Post by Rune Kristiansen
What I at least think this is doing is putting my offset address into
SI, so that by reading through SI, as in mov ax, [si], I will get the value
located at ds:si (or 2000h:0000h, or similar)...
Correct.
Post by Rune Kristiansen
I am really not sure,
but actually that's not the point either.
You mean I just wasted all that time? ;)
Post by Rune Kristiansen
The point is that nasmw doesn't support segment overrides. Am I correct?
I don't know a thing about nasm. I assume the "w" means Windows?
All bets are pretty much off, then. For one thing, segment registers
mean something else entirely in 32 bit protected mode.
Post by Rune Kristiansen
Would using masm be a better choice?
Ok... either you're using the wrong assembler for what you're trying
to accomplish/learn, or you're going about <whatever it is you're
wanting to do> the wrong way for the OS you're writing for.

If you're writing for DOS, then... yeah, either masm or (I'm assuming)
nasm. If you're writing for Windows, then you wouldn't, generally speaking,
concern yourself with segment registers or interrupts (I know you didn't
specifically mention "interrupts", but... given your first paragraph, I'm
making an assumption).
Post by Rune Kristiansen
I am using nasmw (0.98.38 win).
win... Well, hopefully someone who knows nasm(w) will jump in, but...
it sounds like yet another case of attempting to start somewhere *other*
than the beginning... or something.
Post by Rune Kristiansen
Thank you,
Rune
Find some good _basic_ examples (source code, that is). Study them.
When you understand everything therein, move on to some intermediate
examples... and etc. When studying sources, have reference material
to... well, refer to. When you encounter something that isn't "clicking",
it would help, *tremendously*, to ask _specific_ questions, provide
_specific_ examples (such as the section of source code itself, whether
yours or the one you're studying).

I hope that doesn't come off as too harsh, or anything, but the phrase,
"trying to learn assembly", followed with--practically in the same breath--
problems related to disk I/O and buffer access with segmented addressing...
To me (and maybe it's *just* me, I dunno), implies a bit of a gap, and of
course there's the question of OS...

Since your interest appears to be Windows, I, for one, cannot recommend
highly *enough*, the following two sites:

http://www.masmforum.com/index.php

http://win32asmboard.cjb.net/

Look for MASM32, which is a complete package, and includes *invaluable*
references, examples, and tutorials.

For the more basic stuff... (addressing modes, for example)... to be honest,
I really don't know where to direct people. I got started long before there
was such a thing as "a web site", and... I just don't know what's out
there, for the beginner (nor do I know what you already have to work
with/learn from). There is, of course, Randy Hyde's very highly regarded
Art of Assembly, which you can find here:

http://webster.cs.ucr.edu

I've never read it though (hence, usually forget to even mention it), and
my background is with MASM, exclusively (a distinction which would
make sense, once you're familiar with Randy's work).

Well anyway... Do some downloading, dive in, and between here and
the boards, you'll find all the help you need.

Very good luck to you! :)

Jeff

http://www.jefftturner.com (may be temporarily unavailable)
Frank Kotler
2003-11-14 08:29:36 UTC
Post by Rune Kristiansen
Hi,
I am trying to learn assembly. I am now reading up on segments and how
they work with storing data, jumping to code, etc. I have had
problems with reading data from floppy and then reading the memory
buffer by using DS. It didn't seem to point to the right place.
Then I noticed a thing called a segment override, where I can write
mov si, ds:[0000h]
What I at least think this is doing is putting my offset address into
SI, so that by reading through SI, as in mov ax, [si], I will get the value
located at ds:si (or 2000h:0000h, or similar). I am really not sure,
but actually that's not the point either.
The point is that nasmw doesn't support segment overrides. Am I correct?
No, that's not correct. An assembler that couldn't do segment overrides
wouldn't stay useful for very long. You don't need a segment
override too often, but when you need it you *need* it! The syntax is
different, however. You can do:

mov ax, [es:1234h]

or:

es mov ax, [1234h]

or even:

es
mov ax, [1234h]

but you *can't* do:

mov ax, es:[1234h]

...like Masm syntax. Nasm doesn't "play nicely with the other
assemblers", to tell the truth.

Just to add to the confusion, Masm syntax is "funny" ("buggy"?) in some
respects. Suppose you were trying to get the length of the command-tail
from PSP:80h...

mov cl, [80h]

Right? Actually, Masm will assemble that as:

mov cl, 80h ; !!! /\/\oo/\/\ ???

You need to use:

mov cl, ds:[80h]

before Masm will do what you want! The "ds:" segment override (3Eh) is
*not* actually emitted into the codestream. Using Nasm, with the "ds:"
inside the brackets, you *would* get that 3Eh emitted into the code, and
you don't really want it. Won't do any harm, but it's redundant - ds: is
the default, anyway.
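
For anyone who wants to see the bytes, here is a small Python sketch (not assembler output) that builds the encodings by hand. The opcode bytes are hand-assembled from the standard 16-bit x86 encoding rather than taken from either assembler's listing, and the names are invented for illustration:

```python
# Hand-assembled 16-bit encodings; names are illustrative only.

SEGMENT_PREFIX = {"es": 0x26, "cs": 0x2E, "ss": 0x36, "ds": 0x3E}

MOV_CL_MEM80 = bytes([0x8A, 0x0E, 0x80, 0x00])   # mov cl, [0080h]  (memory load)
MOV_CL_IMM80 = bytes([0xB1, 0x80])               # mov cl, 80h      (immediate)


def with_override(segment: str, encoding: bytes) -> bytes:
    """Prepend the one-byte segment override prefix to an instruction."""
    return bytes([SEGMENT_PREFIX[segment]]) + encoding


def show(encoding: bytes) -> str:
    """Format bytes as space-separated hex."""
    return " ".join(f"{b:02X}" for b in encoding)


print(show(MOV_CL_IMM80))                        # the immediate form
print(show(MOV_CL_MEM80))                        # the memory load, no prefix
print(show(with_override("ds", MOV_CL_MEM80)))   # same load with a redundant 3Eh
```

The prefixed form is one byte longer and reads the same location, which is the sense in which the ds: override is harmless but redundant.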

You *might* want a "ds:" override when using (e)sp or (e)bp - they
default to ss... and ip defaults to cs, as you'd probably figure out.
The "string" instructions use es as the segment for the destination
register, di. This does *not* mean that di defaults to es normally! "mov
ax, [di]" defaults to ds, as usual. Some interrupts want addresses in
es:somereg - I think it's int 10h, function 13h, that wants the address of the
string to print in es:bp, of all places! You usually(?) want to get the
right value in a segment register, rather than use an override.

That segmented memory scheme is *hellacious* to grasp, at first, but
once you "get" it, it's fairly simple and logical - even useful! In
32-bit code, segment registers are involved, too - used differently -
but existing OSen use a "flat" memory model, so you can pretty much
ignore segment registers. You won't miss 'em :)
Post by Rune Kristiansen
Would using masm be a better choice?
That's too complicated a question for me. Sorry. :)
Post by Rune Kristiansen
I am using nasmw (0.98.38 win).
Good. To clarify for non-Nasm users, "nasmw" is the Windows *build* of
Nasm. It *runs* only under Windows, but will emit code for bin, aout,
aoutb, coff (djgpp variant), elf, obj (OMF, 16 and 32 bit), as86, win32
(MS's "coff"), rdf, and ieee. We've got code for mingw (a 3rd variant of
coff, apparently) and "Mach-O" (for NeXTstep, among others) but they're
not "in" yet (anybody needs these, speak up!). Most of these output
formats aren't something you'd *want* under Windows, but you're not
limited to just Windows coding because you're using "nasmw".

If you still can't find that buffer, post the code!

Best,
Frank