Gnu Assembler - Real Mode - accessing variable from code
s***@crayne.org
2005-11-01 01:29:35 UTC
Howdy!

I'm a relative newbie to GAS, but I've played a fair bit with MASM.
I'm using GAS in an O/S-less environment. Meaning the elf format is
handy for debug and loading, but we don't have an operating system
setting things up for us. We need to transition the processor from
realmode to protected mode.

As part of my transition from Real mode to Protected, I want to load
the GDT. I'd like to be able to locate the GDT anywhere within the
".data" segment. In MASM, I accomplished this by:

mov ax, 0x9000 ;must match the base addr of the seg where the gdt is.
mov ds, ax ;(a segment register can't be loaded from an immediate)
mov esi, offset gdt_init ;offset relative to wherever the gdt's segment lives
lgdt ds:[esi] ;builds linear address 0x90000 + esi.

The same thing does not appear to work in GAS. If I ask for the
"offset" of a symbol, GAS treats it the way it would for a flat Linux
program: it waits until link time and fills in the symbol's absolute
address rather than an offset relative to the segment.

That makes sense, since Linux uses a flat addressing model. In most
programs DS has a base of 0 and a limit of F_FFFF (in 4k units), so you
can grab whatever, wherever, and you would logically put the address
0x90000 into ESI.

But we're still in real mode, trying to transition, so that doesn't
work when the data lives at 0x90000 and DS is already 0x9000. Setting
ESI to 0x90000 gives an offset far larger than the 64k real mode
segment limit (0xFFFF), so the access GPFs.
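
To make the arithmetic above concrete, here is a small Python sketch of how real mode forms a linear address and why 0x90000 cannot be used as the offset. The function names are made up for illustration; only the segment value 0x9000 and the address 0x90000 come from the post.

```python
REAL_MODE_LIMIT = 0xFFFF  # default real-mode segment limit (64 KiB - 1)


def linear_address(segment: int, offset: int) -> int:
    """Linear address the CPU forms from segment:offset in real mode."""
    if offset > REAL_MODE_LIMIT:
        # This is the situation described in the post: the offset is past
        # the segment limit, so the access faults.
        raise ValueError(f"offset {offset:#x} exceeds the 64 KiB segment limit")
    return (segment << 4) + offset


def offset_in_segment(linear: int, segment: int) -> int:
    """Offset to load into ESI so that segment:offset reaches `linear`."""
    offset = linear - (segment << 4)
    if not 0 <= offset <= REAL_MODE_LIMIT:
        raise ValueError(f"{linear:#x} is not reachable from segment {segment:#x}")
    return offset


if __name__ == "__main__":
    # With DS = 0x9000, the GDT at linear 0x90000 sits at offset 0.
    print(hex(offset_in_segment(0x90000, 0x9000)))
    print(hex(linear_address(0x9000, 0x0)))
```

So with DS = 0x9000 the value wanted in ESI is the distance from the start of the segment, not the linked address.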

What I want, is to obtain the relative offset within the data segment
of a variable - and have the preprocessor stuff this into my mov into
esi instruction. I can manage my DS. I can do this by hand supplying
esi, but it seems like there is a more elegant solution. Suggestions?

thanks!


---------
.intel_syntax noprefix
.arch i686

.file "test.s"
.text
#code16 vs. code16gcc - all targets for call with gcc are 32bit.
.code16
.globl _start
_start:

mov ax,0x9000
mov ds,ax
.att_syntax
#this places 0x90000 into ESI after the Link stage happens.
#Not what we want.
mov $_gdt_init, %esi
.intel_syntax noprefix

mov esi,0
addr32 lgdt ds:esi # setup the gdt...

mov eax,cr0
or eax, 0x01 ##turn on protected mode (0)
mov cr0,eax # turns the protected mode on

ljmp 0x8, 0 #go to the protected mode segment


.code32
.org 0x2000 #absolute offset 0x80000
protected_region:

mov eax, 5
mov ebx, 6
mov edx, 7

hlt


.data
#don't move the gdt. I couldn't get intersegment referencing to work,
#so I need it to live at Address 0!
.align 16
_gdt_init:
.word gdt_finish - _gdt_init - 1 #this is the limit of the GDT table.
.long 0x90000 #this is the location of the GDT table.
.word 0

#code0 descriptor entry
code0_desc:
#overall base 0x80000
.long 0x0000FFFF #base addr 0000, limit FFFF
.long 0x00CF9908 #base 00, g=1 d=1, limit F, present
#dpl=0,1/type=accessed, base 08

gdt_finish:


---------------------
here's how I'm linking it:

OUTPUT_FORMAT("elf32-i386")
OUTPUT_ARCH(i386)

SECTIONS
{
. = SIZEOF_HEADERS;

.text 0x7e000 : { *.o( .text ) }
.data 0x90000 : { *.o( .data ) }
.bss : { *.o( .bss ) }

/DISCARD/ : { *(.note.GNU-stack .comment) }
}
DJ Delorie
2005-11-01 03:28:05 UTC
Post by s***@crayne.org
What I want, is to obtain the relative offset within the data segment
of a variable - and have the preprocessor stuff this into my mov into
esi instruction. I can manage my DS. I can do this by hand supplying
esi, but it seems like there is a more elegant solution. Suggestions?
Well, there's no i386 relocation that means "difference of these two
symbols" or even "offset within section". I think what you can do is
take advantage of the LMA/VMA dichotomy in the linker to have it
create a section that is linked as if it lived at address zero, but is
placed within the image at a non-zero address. It takes a few tricks
to do this (the linker manual documents it as a way to initialize RAM
in an embedded system) but in summary you'll need something like this
snippet from the m32c linker script. You might have to tell the
linker you're building an overlay so it won't complain that the memory
regions overlap.

Alternately, you could just grab the offset of some symbol at the
beginning of the "segment" and do the math at runtime.

I think the PE linker has a section-relative relocation, but that
won't help you if you're using ELF (PE is coff-based).


MEMORY {
REAL (w) : ORIGIN = 0, LENGTH = 0x10000
PROT (w) : ORIGIN = 0, LENGTH = 0xWHATEVER
}

SECTIONS
{
.text :
{
. . .
. = ALIGN(2);
PROVIDE(__rmdatastart = .);
} > PROT =0

.rmdata : {
. = ALIGN(32 / 8);
PROVIDE (__rmdatastart = .);
. . .
PROVIDE (__rmdataend = .);
} > REAL AT>PROT

PROVIDE (__rmdatacopysize = SIZEOF(.rmdata));
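
A rough Python model of the LMA/VMA idea, for readers who have not used it: symbols are resolved against the section's VMA, while the bytes are placed at its LMA. The `Section` class and the 0x10 offset are hypothetical; the 0x90000 load address is borrowed from the original post, and a VMA of 0 is assumed.

```python
from dataclasses import dataclass


@dataclass
class Section:
    vma: int  # address the linker uses when resolving symbols
    lma: int  # address where the bytes are actually placed in the image

    def symbol_value(self, offset: int) -> int:
        """Value the code sees when it references a symbol in this section."""
        return self.vma + offset

    def load_address(self, offset: int) -> int:
        """Where that symbol's bytes really live once loaded."""
        return self.lma + offset


# Assumed layout: linked as if at 0, placed at 0x90000.
rmdata = Section(vma=0x0, lma=0x90000)
GDT_OFFSET = 0x10  # hypothetical position of the GDT inside the section

if __name__ == "__main__":
    ds = rmdata.lma >> 4                      # segment register value
    esi = rmdata.symbol_value(GDT_OFFSET)     # what "offset gdt" would yield
    print(hex(ds), hex(esi), hex((ds << 4) + esi))
```

With this layout the symbol value is already the segment-relative offset, so DS:ESI lands on the load address without any runtime math.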
s***@crayne.org
2005-11-02 23:08:43 UTC
DJ - thanks for the info. I ended up using something similar to
relocate my sections in memory with a linker script.

But it sounds like I can answer my own question. I found a mechanism to
generate a relative offset within a section by using a bit of
assemble/linktime arithmetic. It appears (at least under linux/elf)
that this arithmetic outputs a 32 bit operand, but gets rid of the
absolute portion of the address. Especially handy for segmented
architectures...

mov esi, offset(gdt_init - realmode_start)

I have not figured out how to "cast" this arithmetic into a 16 bit
number. So unfortunately, it doesn't work to use this same arithmetic
for loading a segment descriptor unless you do it indirectly through a
mov to EAX. If you want to ljmp, it doesn't work.
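
The symbol-difference trick can be checked with a few lines of Python. This is only a sketch: the helper name and the 0x40 offset of gdt_init are made up, and the 0x7e000 base is the .text address from the linker script in the first post.

```python
def segment_and_offset(section_base: int, start_offset: int, symbol_offset: int):
    """Model `offset realmode_start >> 4` and `gdt_init - realmode_start`.

    section_base  : where the linker finally places the section
    start_offset  : offset of realmode_start within the section
    symbol_offset : offset of gdt_init within the section
    """
    start_abs = section_base + start_offset
    if start_abs & 0xF:
        raise ValueError("realmode_start must be 16-byte aligned to be a segment base")
    if start_abs >= 0x100000:
        raise ValueError("realmode_start must be below 1 MiB in real mode")
    segment = start_abs >> 4
    # The difference depends only on offsets inside the section, which the
    # assembler knows without help from the linker.
    relative = symbol_offset - start_offset
    return segment, relative


if __name__ == "__main__":
    seg, rel = segment_and_offset(0x7E000, 0x0, 0x40)
    print(hex(seg), hex(rel), hex((seg << 4) + rel))
```

Moving the section to a different base changes the segment value but not the difference, which is why the result has no absolute part.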

Thanks!
--eric
***@yahoo.com

the code below as a complete example...
-------------------------------------------

#don't put this gdt above 1MB addressable - we're in real mode, remember?
.text
.code16
.globl _start

realmode_start:
_start:

#locate the GDT...
#point DS at realmode_start (so it is addr 0 of DS) so our relo stuff
#will work. realmode_start has to be 16-byte aligned for the shift.
mov eax,offset realmode_start #this grabs an absolute offset.
shr eax, 4 #make it a segment base.
mov ds,ax

#this finds the relative address of gdt_init.
mov esi, offset(gdt_init - realmode_start)

addr32 lgdt ds:esi # setup the gdt...

mov eax,cr0
or eax, 0x01 ##turn on protected mode (0)
mov cr0,eax # turns the protected mode on

#LJMP CS (pick the flat cs descriptor), EIP - pick the absolute offset.
#we're in real mode, and we default to 16:16. we want 16:32, so I need
#to get a 0x66 opcode prefix AND I need the longer assembly of
#the instruction.
#data32 as a prefix here seems to accomplish that.
data32 ljmp 0x8, offset main_begin

# ljmp offset(flat_code_desc-gdt_init), offset maintest_begin
#go to the protected mode segment - but errors. 32b in cs?



.align 16
gdt_init:
.word gdt_finish - gdt_init - 1 #this is the limit of the GDT table.
.long gdt_init #this is the location of the GDT table.
.word 0 #padding.

#it would be wise to keep this descriptor 1st in the table since I
#hardcoded the reference to it in the ljmp to go to protected mode.
flat_code_desc:
#base=0, Limit = FFFF_FFFF, expand up, accessed, 4k granularity and big.
.long 0x0000FFFF
.long 0x00CF9900
flat_data_desc:
#base=0, Limit = FFFF_FFFF, expand up, Writeable, accessed, 4k
#granularity and big.
#you can use this for a stack since we don't really care about expand
#up/down.
.long 0x0000FFFF
.long 0x00CF9300


gdt_finish:
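
For anyone checking the descriptor words in the listings by hand, the following Python sketch decodes the two 32-bit halves of a segment descriptor. The function name and the returned keys are my own; the values fed in are the ones from the posts.

```python
def decode_descriptor(low: int, high: int) -> dict:
    """Decode an 8-byte x86 segment descriptor given as two 32-bit words."""
    base = (low >> 16) | ((high & 0xFF) << 16) | (high & 0xFF000000)
    limit = (low & 0xFFFF) | (high & 0x000F0000)
    access = (high >> 8) & 0xFF
    granular = bool(high & (1 << 23))    # G bit: limit counted in 4 KiB pages
    default_32 = bool(high & (1 << 22))  # D/B bit: 32-bit default size
    if granular:
        limit = (limit << 12) | 0xFFF
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0x3,
        "executable": bool(access & 0x08),
        "accessed": bool(access & 0x01),
        "granular": granular,
        "default_32": default_32,
    }


if __name__ == "__main__":
    flat_code = decode_descriptor(0x0000FFFF, 0x00CF9900)
    flat_data = decode_descriptor(0x0000FFFF, 0x00CF9300)
    code0 = decode_descriptor(0x0000FFFF, 0x00CF9908)  # from the first post
    for name, d in (("flat_code", flat_code), ("flat_data", flat_data), ("code0", code0)):
        print(name, hex(d["base"]), hex(d["limit"]), d["executable"])
```

The flat descriptors decode to base 0 with a 4 GiB limit, and the descriptor from the first post decodes to base 0x80000, matching the comments in the listings.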
DJ Delorie
2005-11-03 00:52:19 UTC
Post by s***@crayne.org
But it sounds like I can answer my own question. I found a mechanism
to generate a relative offset within a section by using a bit of
assemble/linktime arithmetic. It appears (at least under linux/elf)
that this arithmetic outputs a 32 bit operand, but gets rid of the
absolute portion of the address. Especially handy for segmented
architectures...
mov esi, offset(gdt_init - realmode_start)
This works if both symbols are in the same section in the same asm
file. In that case, gas can do the math locally and no relocations
are needed.

If you ever split this into different source files, it will stop
working.
Post by s***@crayne.org
I have not figured out how to "cast" this arithmetic into a 16 bit
number.
Try just using si instead of esi.
Post by s***@crayne.org
So unfortunately, it doesn't work to use this same arithmetic
for loading a segment descriptor unless you do it indirectly through a
mov to EAX. If you want to ljmp, it doesn't work.
Right, because ljmp needs a segment relocation that gas and ld don't
know about. You have to actually create an OMF executable to get that
relocation.
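
As a side note on the `data32 ljmp 0x8, offset main_begin` line in the earlier listing: in 16-bit code the 16:32 far jump is the 0x66 operand-size prefix, opcode 0xEA, a 32-bit offset and then the 16-bit selector. A small Python sketch of that byte layout follows; the helper name is made up, and 0x80000 is just an assumed target address, not something taken from the second listing.

```python
import struct


def encode_far_jump_16_32(selector: int, offset: int) -> bytes:
    """Bytes of `jmp selector:offset32` as assembled in a 16-bit code segment."""
    # 0x66 = operand-size prefix, 0xEA = direct far jump,
    # then the little-endian 32-bit offset followed by the 16-bit selector.
    return struct.pack("<BBIH", 0x66, 0xEA, offset, selector)


if __name__ == "__main__":
    # Selector 0x8 is the first descriptor after the 8-byte GDT header slot.
    print(encode_far_jump_16_32(0x8, 0x80000).hex(" "))
```

The selector is the last two bytes of the instruction, which is the field that would need the segment-style relocation mentioned above.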
