Discussion:
How to debug inside the BIOS and/or interrupt?
Jim Leonard
2007-07-05 02:45:50 UTC
Some of you may remember me with my 808x questions from a while back.
I've put all that into an audio/video player for an old IBM PC (google
"8088 Corruption" for more info). I'm now trying to improve the
performance of the player, and running into some very odd things that
I can't figure out and was hoping someone might have some advice on
how to debug my program.

Background: The basic method of operation of the program is to drive a
Sound Blaster off of IRQ 2 (hard drive is IRQ 5), and the SB interrupt
handler pulls video and audio data out of a queue and displays/sounds
them. In the foreground, the main program is in a loop that
constantly tries to pull data from the hard disk and stuff it into the
queue. If memory fills up, the main program loop pauses. If memory
empties, the interrupt handler is told to "do nothing" until memory is
full with video/audio data again. There is unfortunately some shared
code (memory handling) between the main loop and the interrupt
handler, so I prevent re-entrancy problems by wrapping the shared
pieces in the main disk load loop with CLI/STI.

Currently, only one audio/video chunk is read per disk access, and the
program works. I tried to optimize it today by having the disk
portion read more than one audio/video chunk at a time. This
dramatically improves disk performance, but now the program locks up
after a random period of time seemingly INSIDE the INT 21,3f call to
read data from a file handle. I know this because I stuck various
"printf"s (not really, but you get the idea) before certain sections
of code, and the last thing I can see is the "marker" right before the
disk read.

How can I debug this sort of thing? I've tried to break the operation
of the program when it hangs in both D86 and Turbo Debugger, but being
on the 8088 there's no hardware virtualization and the machine does
not respond to my keypress. How did people debug interrupt handlers,
BIOS functions, etc. "back in the day"?

I can provide full source to anyone who's willing to look at it, if it
will help... It's Turbo Pascal 7.0 with inline assembler and highly
commented. Any advice is appreciated!
Rod Pemberton
2007-07-05 06:17:27 UTC
Is the *entire* IRQ 2 routine wrapped in CLI/STI also?

TSRs usually check the InDOS flag prior to calling DOS to prevent
reentrancy problems. Is it possible that the lockup you're getting is
*somehow* a reentrancy problem: IF flag problem, interruptible Pascal code in
IRQ2, etc.?


Rod Pemberton
Benjamin David Lunt
2007-07-05 15:28:00 UTC
"Rod Pemberton" <***@crayne.org> wrote in message news:f6i2bg$9t8$***@aioe.org...
Hi Rod, hi guys,
Post by Rod Pemberton
Is the *entire* IRQ 2 routine wrapped in CLI/STI also?
On interrupt, the IF flag is cleared by hardware, and then
on the encounter of the IRET, the pop flags part restores
the IF flag. No need for the CLI/STI pair.

If I were you, I would place an "in ISR flag" in my routine
then allow interrupts to happen. Then at the start of the
routine, if "in ISR flag" is set, simply IRET.

Now that Interrupts are allowed, the IRQ 5 (or the Harddrive
interrupt) can now fire. Is it that DOS is waiting for the
Hard drive interrupt to happen and since you have the IF
flag cleared, it is waiting indefinitely?

Ben
Jim Leonard
2007-07-05 17:07:22 UTC
Post by Benjamin David Lunt
If I were you, I would place an "in ISR flag" in my routine
then allow interrupts to happen. Then at the first of the
routine, if "in ISR flag" is set, simply IRET.
Enabling CPU interrupts inside my handler is something I hadn't
thought of before; I will give that a shot and see how it goes.

However, I'm not sure an "in ISR" flag is necessary -- The interrupt
in question is a hardware interrupt. Since it's a hardware interrupt,
isn't another hardware interrupt for the SAME IRQ not possible until I
issue the EOI to the PIC?
Rod Pemberton
2007-07-05 22:32:16 UTC
Post by Jim Leonard
However, I'm not sure an "in ISR" flag is necessary -- The interrupt
in question is a hardware interrupt. Since it's a hardware interrupt,
isn't another hardware interrupt for the SAME IRQ not possible until I
issue the EOI to the PIC?
Not sure, it's been a little while now since I did this for my OS.

Only one PIC on an original IBM PC, right?

Is there a default IRQ 2 routine by DOS or the SB drivers? If so, why
aren't you chained to it? Speed? What happens if you try to clear the
specific EOI instead of round-robin?

    mov al,62h      ; OCW2 specific EOI for IRQ 2 (60h + IRQ number)
    out 20h,al      ; PIC command port

My experience (modern AT) has been that higher IRQ's trigger first and block
all other IRQ's until cleared. But, it's very possible I may have only been
seeing hardware IRQ's at the time... Unfortunately, I was having some
problems with round-robin or non-specific EOI's, like DOS, and switched to
clearing specific EOI's. I couldn't seem to keep the mouse and keyboard
data separate with round-robin EOI's. It was like pending EOI's on occasion
didn't match data: mouse or keyboard. It rarely occurred when only one was
being used. But, it occurred much more frequently when both the mouse and
keyboard were sending data at the same time. Result: lockup.


Rod Pemberton
Jim Leonard
2007-07-06 02:15:15 UTC
Post by Rod Pemberton
Only one PIC on an original IBM PC, right?
Yep. IRQ 0 and 1 are reserved (0 is system timer tick and 1 is
keyboard). 2-7 are available.
Post by Rod Pemberton
Is there a default IRQ 2 routine by DOS or the SB drivers? If so, why
Nope. It's completely fly-by-the-seat-of-your-pants :)
Post by Rod Pemberton
My experience (modern AT) has been that higher IRQ's trigger first and block
all other IRQ's until cleared.
If by "higher" you mean "lower IRQ number" then yes, that is what I'm
seeing. For example, IRQ 0, the system timer tick, is the highest
priority and cannot be interrupted while it is running (since all
other IRQs are below it).
Post by Rod Pemberton
Unfortunately, I was having some
problems with round-robin or non-specific EOI's, like DOS, and switched to
clearing specific EOI's.
From what I've read, I don't think that's necessary -- the PIC keeps
track of which hardware interrupt is in service and knows which one is
done when you send the EOI. I could be misinterpreting what I've
read, though.
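
A toy model may help pin down what the PIC tracks here. The C sketch below simulates one 8259A in fully nested mode; it is not 8088 code, and the names Pic, pic_raise, pic_pending, pic_acknowledge and pic_command are made up for illustration. It assumes the documented behaviour: an in-service IRQ holds back itself and everything of lower priority, a non-specific EOI (20h) clears the highest-priority in-service bit, and a specific EOI (60h + IRQ) clears only the named bit.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of one 8259A in fully nested mode.
   Bit n of each register is IRQ n; IRQ 0 has the highest priority. */
typedef struct {
    uint8_t irr;  /* requests raised but not yet acknowledged */
    uint8_t isr;  /* in service: acknowledged, EOI not yet received */
    uint8_t imr;  /* mask register: 1 = IRQ masked off */
} Pic;

/* Lowest set bit = highest priority; -1 if no bit is set. */
static int highest_priority(uint8_t bits) {
    for (int irq = 0; irq < 8; irq++)
        if (bits & (1u << irq)) return irq;
    return -1;
}

/* A device pulls its IRQ line. */
void pic_raise(Pic *p, int irq) {
    assert(irq >= 0 && irq < 8);
    p->irr |= (uint8_t)(1u << irq);
}

/* Which IRQ would be delivered right now (assuming IF=1)? -1 if none. */
int pic_pending(const Pic *p) {
    int req  = highest_priority((uint8_t)(p->irr & (uint8_t)~p->imr));
    int busy = highest_priority(p->isr);
    if (req < 0) return -1;
    if (busy >= 0 && req >= busy) return -1;  /* same or lower priority waits */
    return req;
}

/* The CPU acknowledges: the request moves from IRR to ISR. */
int pic_acknowledge(Pic *p) {
    int irq = pic_pending(p);
    if (irq >= 0) {
        uint8_t bit = (uint8_t)(1u << irq);
        p->irr &= (uint8_t)~bit;
        p->isr |= bit;
    }
    return irq;
}

/* Models OUT 20h,AL for the two EOI forms discussed in this thread. */
void pic_command(Pic *p, uint8_t ocw2) {
    if (ocw2 == 0x20) {                      /* non-specific EOI */
        int irq = highest_priority(p->isr);
        if (irq >= 0) p->isr &= (uint8_t)~(1u << irq);
    } else if ((ocw2 & 0xF8) == 0x60) {      /* specific EOI, low 3 bits = IRQ */
        p->isr &= (uint8_t)~(1u << (ocw2 & 7));
    }
}
```

In this model, with IRQ 2 in service a raised IRQ 5 stays pending until the EOI arrives, while IRQ 0 or 1 could still nest if the CPU has interrupts enabled.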
Markus.Humm
2007-07-10 17:37:06 UTC
Hello,

You're right that higher IRQs on ATs block lower ones, since the 2nd PIC
is cascaded through IRQ 2 on the first, and the old IRQ 2 line is rerouted
to IRQ 9 on the second.

So it might be bad practice to put the SB on IRQ 2.

Greetings

Markus
Jim Leonard
2007-07-11 16:01:39 UTC
Post by Markus.Humm
You're right that higher IRQs on ATs block lower ones, since the 2nd PIC
is cascaded through IRQ 2 on the first, and the old IRQ 2 line is rerouted
to IRQ 9 on the second.
So it might be bad practice to put the SB on IRQ 2.
I don't have a choice; this is an 8088 system. Sound Blaster Pro
supports IRQ settings of 2, 5, 7, and 10. My system:

IRQ 0 - System timer
IRQ 1 - Keyboard
IRQ 2 - free
IRQ 3 - Serial port
IRQ 4 - free
IRQ 5 - Hard drive controller
IRQ 6 - Floppy Controller
IRQ 7 - Parallel port

As you can see, IRQ 2 is my only option.

In my specific case, my Sound Blaster on IRQ2 is interrupting my BIOS
INT 13 read (as called by DOS) on IRQ5, and then taking too long or
some other crime, which causes the INT 13 read to never return and
hang the system. Still looking into it :)
Bob Masta
2007-07-12 11:46:55 UTC
Post by Jim Leonard
Post by Markus.Humm
You're right that higher IRQs on ATs block lower ones, since the 2nd PIC
is cascaded through IRQ 2 on the first, and the old IRQ 2 line is rerouted
to IRQ 9 on the second.
So it might be bad practice to put the SB on IRQ 2.
I don't have a choice; this is an 8088 system. Sound Blaster Pro
supports IRQ settings of 2, 5, 7, and 10. My system:
IRQ 0 - System timer
IRQ 1 - Keyboard
IRQ 2 - free
IRQ 3 - Serial port
IRQ 4 - free
IRQ 5 - Hard drive controller
IRQ 6 - Floppy Controller
IRQ 7 - Parallel port
As you can see, IRQ 2 is my only option.
In my specific case, my Sound Blaster on IRQ2 is interrupting my BIOS
INT 13 read (as called by DOS) on IRQ5, and then taking too long or
some other crime, which causes the INT 13 read to never return and
hang the system. Still looking into it :)
IRQ 2 shouldn't be a problem as long as you run on PC/XT.
But you can probably use IRQ 7 instead, if it ever becomes
necessary. IIRC it is normally not enabled to do anything
for the printer port, only when specifically activated. (Maybe
by print spooling?) I have used it often for data acquisition
with no difficulties.

But it seems unlikely that the choice of IRQ is causing the
problem you are encountering....

Best regards,


Bob Masta

D A Q A R T A
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Signal Generator
Science with your sound card!
Jason Burgon
2007-07-12 18:28:36 UTC
Post by Bob Masta
But it seems unlikely that the choice of IRQ is causing the
problem you are encountering....
Except that IRQ#2 has a higher priority than IRQ#5 and you must therefore
re-enable all interrupts as I suggested in a previous post, handling
re-entrancy in some way.

--
Jay

Jason Burgon - author of Graphic Vision
http://homepage.ntlworld.com/gvision

Jason Burgon
2007-07-05 22:55:04 UTC
Post by Jim Leonard
However, I'm not sure an "in ISR" flag is necessary -- The interrupt
in question is a hardware interrupt. Since it's a hardware interrupt,
isn't another hardware interrupt for the SAME IRQ not possible until I
issue the EOI to the PIC?
Correct, but IRQ priorities on an IBM PC have been assigned somewhat
arbitrarily, so it's a good idea IMO to issue an EOI and execute an STI early
in your ISR. This will allow all other interrupts to occur, and not just the
ones that have a higher PIC priority. This of course also allows your
ISR to re-enter itself, but you can avoid disaster with a counting
semaphore. Depending on what exactly your ISR does, you might want to do
something like the following:

procedure MyISR; far; assembler;  { no parameters, so no stack frame }
const
  InISR: Integer = 0;    { typed constant, lives in the data segment }
asm
  push ax
  push ds
  mov  ax,Seg @DATA      { DS is whatever the interrupted code left in it }
  mov  ds,ax
  inc  [InISR]
  sti                    { Allow higher priority interrupts }
  cmp  [InISR],1
  mov  al,$20            { MOV and OUT leave the flags from CMP intact }
  out  $20,al            { EOI - Allow all interrupts, incl. this one }
  jne  @@2               { ISR is re-entered, so exit with InISR + 1, EOI & STI }
  push bx
  push ...etc
  push es
@@1:
  call DoISROperation
  dec  [InISR]           { one instruction, so an IRQ cannot split it }
  jnz  @@1               { Repeat for every re-entry }
  pop  es
  pop  ...etc
  pop  bx
@@2:
  pop  ds
  pop  ax
  iret
end;

The above (written off the top of my head) has two advantages:

(1) It doesn't block any other ISR's, even lower priority ones such as the
IRQ5 which you say your hard drive is using.

(2) It allows re-entrancy without causing any damage and without missing any
"DoISROperation"s. The jitter might be bad, but often this either doesn't
matter, or bad jitter is much better than completely missing an ISR
operation altogether.

The above assumes that the hardware in question will generate another IRQ
without you reading or writing any of its registers. If it doesn't then the
semaphore isn't needed, or you need to re-enable its IRQ firing mechanism
even when it's just a re-entry.
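
The counting logic can be checked away from the hardware. Below is a minimal C simulation of the same idea, a sketch rather than ISR code: in_isr plays the role of InISR, do_isr_operation stands in for DoISROperation, and the nested_to_inject hook and run_with_nested_irqs driver are hypothetical test scaffolding that fake the same IRQ firing again while the work is in progress.

```c
#include <assert.h>

static int in_isr = 0;            /* the counting semaphore (InISR above) */
static int operations = 0;        /* how many times the real work ran */
static int nested_to_inject = 0;  /* test hook: IRQs arriving during the work */

void my_isr(void);

/* Stand-in for DoISROperation. While it runs, the hook may "fire" the
   same IRQ again, which re-enters my_isr one level deep. */
static void do_isr_operation(void) {
    operations++;
    if (nested_to_inject > 0) {
        nested_to_inject--;
        my_isr();
    }
}

void my_isr(void) {
    if (++in_isr != 1)
        return;                   /* re-entered: record the request and leave */
    do {
        do_isr_operation();
    } while (--in_isr != 0);      /* one pass per IRQ that arrived meanwhile */
}

/* Driver: one real IRQ plus `nested` IRQs arriving during the work.
   Returns the number of operations actually performed. */
int run_with_nested_irqs(int nested) {
    assert(nested >= 0);
    in_isr = 0;
    operations = 0;
    nested_to_inject = nested;
    my_isr();
    return operations;
}
```

Each re-entry costs only an increment and a return, and the outermost invocation makes up the missed passes, so run_with_nested_irqs(3) performs four operations and leaves the counter at zero.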
Jim Leonard
2007-07-06 16:39:13 UTC
Post by Jason Burgon
(2) It allows re-entrancy without causing any damage and without missing any
"DoISROperation"s. The jitter might be bad, but often this either doesn't
matter, or bad jitter is much better than completely missing an ISR
operation altogether.
That is a very interesting idea (counting the number of misses, then
executing the ISR n times), one that I hadn't considered until now.
Thanks for the idea! I'll see what happens to the Sound Blaster if it
misses an IRQ. Since the SB IRQ firing frequency is guaranteed to be
60Hz or less (due to the nature of my program), the jitter should be
acceptable.
Jim Leonard
2007-07-05 16:04:20 UTC
Post by Rod Pemberton
Is the *entire* IRQ 2 routine wrapped in CLI/STI also?
No, it's not. Should it be? The interrupt handler is called by a
hardware interrupt, and I was under the impression that, on entry to
an interrupt handler, processor interrupts are disabled as if CLI were
called. At the end of my handler is an EOI (to tell the PIC the
hardware interrupt has been serviced), and then IRET, which I thought
set the interrupt flag again. Am I misunderstanding something?
Post by Rod Pemberton
TSRs usually check the InDOS flag prior to calling DOS to prevent
reentrancy problems. Is it possible that the lockup you're getting is
*somehow* a reentrancy problem: IF flag problem, interruptible Pascal code in
IRQ2, etc.?
The interrupt handler doesn't call BIOS or DOS functions, and the only
shared code it calls, I wrap CLI/STI around in the main program so
that the interrupt can't be running when I call said code.
Rod Pemberton
2007-07-05 22:33:55 UTC
Post by Jim Leonard
Post by Rod Pemberton
Is the *entire* IRQ 2 routine wrapped in CLI/STI also?
No, it's not. Should it be? The interrupt handler is called by a
hardware interrupt, and I was under the impression that, on entry to
an interrupt handler, processor interrupts are disabled as if CLI were
called. At the end of my handler is an EOI (to tell the PIC the
hardware interrupt has been serviced), and then IRET, which I thought
set the interrupt flag again. Am I misunderstanding something?
No.

Sorry, my mistake. I should have asked if interrupts were disabled (IF
cleared) for the entire routine. I wrap my IRQ routines with CLI/STI
because they go through a trap gate. This shouldn't affect you. I was
considering that interrupts might be enabled for part of the IRQ routine,
and that it might be a problem.

Anyway, Ben thinks it might work better with interrupts enabled. It's worth
a shot. I usually try all combinations anyway. :-) Helps to learn new
stuff - even if painful...

The problem I experienced with my personal OS was corruption of static or
shared data. Since my IRQ routines were becoming quite large, I enabled
interrupts for non-critical portions of the routines. I wanted to find out
if the OS became more responsive or less so. It seemed to be slightly more
responsive. So, I left it. Unfortunately, there was one pointer which
absolutely needed to remain unchanged until each IRQ routine exited. Of
course, allowing interrupts meant that the pointer was overwritten with a
new value with every interrupt. Since the data pointed to at that time was
non-critical, it appeared that things were working properly, but they weren't.


Rod Pemberton
Wolfgang Kern
2007-07-06 15:40:44 UTC
Rod Pemberton wrote:

[..]
Post by Rod Pemberton
Sorry, my mistake. I should have asked if interrupts were disabled (IF
cleared) for the entire routine. I wrap my IRQ routines with CLI/STI
because they go through a trap gate. This shouldn't affect you. I was
considering that interrupts might be enabled for part of the IRQ routine,
and that it might be a problem.
Anyway, Ben thinks it might work better with interrupts enabled.
It's worth a shot. I usually try all combinations anyway. :-)
Helps to learn new stuff - even if painful...
The problem I experienced with my personal OS was corruption of static or
shared data. Since my IRQ routines were becoming quite large, I enabled
interrupts for non-critical portions of the routines.
[...]

My first OS attempts encountered similar effects. So I decided to split
the story into shortest HW-related 'IRQ-routines' and 'event-handlers'.
The main idle in my OS just polls two 32-bit event-flags variables and
the main timecounter for timesliced tasks.
__________
MAIN_RESET:                     ;at start and after fatal errors
    cli
    LGDT...                     ;initialise everything that's needed
    ...
MAIN:
    sti
    mov eax,[hw_events]
    and eax,[hw_mask]           ;may ignore ie: IRQ09 VGA-retrace
    jz L0
    call act_event
L0:
    mov eax,[time_events]       ;software and hardware time-outs
    and eax,[sw_mask]
    jz L1
    call act_timeout
L1:
    mov eax,[main_time]
    cmp eax,[next_scheduled]
    jc MAIN                     ;not yet time for the next task
    call next_task
    jmp MAIN
________

The IRQ00 routine only increments one main counter and decrements a set
of timeout-counters until zero (set TimeOut# event bit on 1 -> 0 only).
And I then added the 0...999 mSec counter synchronised by RTCL (IRQ8).

My 'largest' IRQ-handlers are mouse and keyboard, because these devices
fire several IRQs in certain sequences; the event-bits are only set
when the sequence is completed. I enable IRQs very early in these two, but
never encountered any problems by using 'normal' EOI (out[A0/20],20)
even though KEYBD(IRQ01) and PS/2 mouse(IRQ0C) share I/O-ports.

With 'largest' I mean larger than 64 bytes, but not more than 256,
whereas (depending on the sequence count) again maximal 64 bytes per
IRQ will be executed.

So the only possible re-entrance could occur if the KEYBD-IRQ would
handle KEYBD-LEDs on its own, because LED-setting needs 'some' time
and a fast typist can hit two more keys meanwhile.
I experienced this as buggy behaviour and finally moved CAPS,NUM,SCROLL-
LED-actions out of the IRQ-handler (event driven now).
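
The split between short IRQ routines and event handlers can be sketched in portable C. This is only an illustration under stated assumptions: the event bit assignments, irq_post, take_events, set_hw_mask and poll_once are invented names, and the read-and-clear in take_events is an assumption about what act_event does with the flags. On real hardware that read-and-clear would need interrupts disabled around it (or a single atomic instruction).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical event bit assignments, one bit per source. */
enum { EV_TIMER = 1u << 0, EV_KEYBOARD = 1u << 1, EV_MOUSE = 1u << 12 };

static volatile uint32_t hw_events = 0;   /* bits set only by IRQ routines */
static uint32_t hw_mask = 0xFFFFFFFFu;    /* 0 bit = ignore that event */

int handled_keyboard = 0;
int handled_mouse = 0;

/* IRQ side: as short as possible, just record that something happened. */
void irq_post(uint32_t event_bit) { hw_events |= event_bit; }

void set_hw_mask(uint32_t mask) { hw_mask = mask; }

/* Main-loop side: fetch and clear the pending, unmasked events.
   Masked events stay pending until the mask allows them. */
uint32_t take_events(void) {
    uint32_t pending = hw_events & hw_mask;
    hw_events &= ~pending;
    return pending;
}

/* One pass of the idle loop: the slow work happens here, outside any IRQ. */
void poll_once(void) {
    uint32_t ev = take_events();
    if (ev & EV_KEYBOARD) handled_keyboard++;
    if (ev & EV_MOUSE)    handled_mouse++;
}
```

The point of the design is that nothing slow ever runs with an IRQ in service, so re-entrancy inside the handlers has very little room to occur.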

__
wolfgang
Bob Masta
2007-07-05 12:15:06 UTC
I don't have an answer to your problem, but I am curious about
your statement that the interrupt handler is told to "do nothing"
until data is again available. SB data is normally transferred via
DMA (except for a crappy old 8-bit mode), and the interrupt is
just to tell you that the DMA block is done. I don't recall how
the SB interacts with the DMA if you don't acknowledge the
interrupt to the SB... it might stall, or it might just loop on the
old contents of the DMA buffer. Is it possible that the
"do nothing" is causing the SB and the motherboard DMA
controller to get out of sync somehow? Still, I wouldn't
think this would lock up, just output garbage data to the SB.

Just a thought...

Bob Masta

Jim Leonard
2007-07-05 16:13:45 UTC
Post by Bob Masta
I don't have an answer to your problem, but I am curious about
your statement that the interrupt handler is told to "do nothing"
until data is again available. SB data is normally transferred via
DMA (except for a crappy old 8-bit mode), and the interrupt is
just to tell you that the DMA block is done. I don't recall how
the SB interacts with the DMA if you don't acknowledge the
interrupt to the SB... it might stall, or it might just loop on the
old contents of the DMA buffer. Is it possible that the
"do nothing" is causing the SB and the motherboard DMA
controller to get out of sync somehow? Still, I wouldn't
think this would lock up, just output garbage data to the SB.
I apologize; I misrepresented what the handler does when it is told to
"do nothing". I have a flag in memory that the interrupt handler
checks on entrance. If it's set "enabled", the handler updates the
video and audio buffers. If set "disabled", the handler fills the
audio buffers with silence. Either way, it isn't actually disabled,
but "told" to do a "NOP"-type of loop. This type of control is
necessary so that the main program code can "pause" the interrupt
handler's functions if the data queue becomes empty and needs time to
fill up again.
Jim Leonard
2007-07-06 03:38:11 UTC
Post by Bob Masta
SB data is normally transferred via
DMA (except for a crappy old 8-bit mode), and the interrupt is
just to tell you that the DMA block is done.
This just dawned on me: My SB uses DMA 1... and I think the hard disk
controller does too. So the SB IRQ interrupting a disk transfer
already in progress is probably what is locking the machine up (DMA is
probably getting dorked with mid-transfer).

You wrote DAQARTA, which took data in from a SB and wrote it to
disk... how did you get around this problem? Does your program try to
use both devices and/or DMA channels simultaneously?
Bob Masta
2007-07-06 12:18:48 UTC
Post by Jim Leonard
Post by Bob Masta
SB data is normally transferred via
DMA (except for a crappy old 8-bit mode), and the interrupt is
just to tell you that the DMA block is done.
This just dawned on me: My SB uses DMA 1... and I think the hard disk
controller does too. So the SB IRQ interrupting a disk transfer
already in progress is probably what is locking the machine up (DMA is
probably getting dorked with mid-transfer).
You wrote DAQARTA, which took data in from a SB and wrote it to
disk... how did you get around this problem? Does your program try to
use both devices and/or DMA channels simultaneously?
I don't recall ever dealing with hard drive DMA. I think it wasn't
used much under DOS because it was so slow... REP INSW
or OUTSW were significantly faster on ISA-bus machines.
IBM set the DMA clock to half the bus speed on the AT, since
the DMA chips at the time were limited to 5 MHz and the AT
bus ran at 6 MHz. I think the half-speed tradition was carried
on by all the clones as well. I don't know what the current
state of affairs is on modern systems. A quick Google search
gives me the impression that hard drive DMA is an option you
may need to specifically enable.

Best regards,



Bob Masta

Jim Leonard
2007-07-06 15:57:39 UTC
Post by Bob Masta
I don't recall ever dealing with hard drive DMA. I think it wasn't
used much under DOS because it was so slow... REP INSW
or OUTSW were significantly faster on ISA-bus machines.
Actually, that was only true on 286 machines or faster. The original
808x was slow enough that DMA was about 3x faster, so pretty much all
hard drive controllers on 808x machines use DMA. Including mine :-)
(I'm doing this on an 808x system as originally posted).

I'm going to investigate re-enabling interrupts as soon as the handler
is entered, and then somehow dealing with re-entrancy.
ArarghMail707NOSPAM
2007-07-06 20:31:57 UTC
On Fri, 06 Jul 2007 12:18:48 GMT, ***@daqarta.com (Bob Masta)
wrote:
<snip>
Post by Bob Masta
I don't recall ever dealing with hard drive DMA. I think it wasn't
used much under DOS because it was so slow... REP INSW
or OUTSW were significantly faster on ISA-bus machines.
IBM set the DMA clock to half the bus speed on the AT, since
the DMA chips at the time were limited to 5 MHz and the AT
bus ran at 6 MHz. I think the half-speed tradition was carried
on by all the clones as well. I don't know what the current
state of affairs is on modern systems. A quick Google search
gives me the impression that hard drive DMA is an option you
may need to specifically enable.
ISTR that the original PC & XT hard drive controllers used DMA. It
was with the AT that INSW & OUTSW were used. And now with PCI we're
back to DMA, although I do think that it defaults to PIO unless
enabled.
--
ArarghMail707 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the extra stuff from the reply address.
HubbleBubble
2007-07-05 09:21:48 UTC
What kind of signaling are you using to determine when your buffer is
empty and full? Are you sure you are not getting a classic producer/
consumer semaphore contention where it is possible for the buffer to
think it's both full and empty at the same time? When you are dealing
with single chunks it's essentially an asynchronous operation but once
you start reading multiple chunks into a buffer and then extracting
from a buffer it introduces synch problems. The shared code section is
probably the danger zone. You should separate file read and buffer
read operations into critical regions to avoid any possibility of
contention - see Tanenbaum's Fundamentals of OS's for a full
explanation.
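
One common way to rule out the "full and empty at once" ambiguity is a ring buffer that keeps one slot free and lets each side write only its own index. The C sketch below is an assumption about how such a queue could look, not a description of the player's actual code; ChunkQueue, QUEUE_SLOTS and the queue_* functions are illustrative names, and an int stands in for a real audio/video chunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QUEUE_SLOTS 8   /* assumed size; holds at most QUEUE_SLOTS - 1 chunks */

typedef struct {
    volatile uint8_t head;   /* next slot to fill; written only by the disk loop */
    volatile uint8_t tail;   /* next slot to play; written only by the handler */
    int chunk[QUEUE_SLOTS];  /* stand-in for real chunk data */
} ChunkQueue;

/* Empty and full are distinct index states, so they cannot both hold. */
bool queue_empty(const ChunkQueue *q) {
    return q->head == q->tail;
}

bool queue_full(const ChunkQueue *q) {
    return (uint8_t)((q->head + 1) % QUEUE_SLOTS) == q->tail;
}

/* Producer side (main program): store the data first, then publish it. */
bool queue_put(ChunkQueue *q, int c) {
    if (queue_full(q)) return false;
    q->chunk[q->head] = c;
    q->head = (uint8_t)((q->head + 1) % QUEUE_SLOTS);
    return true;
}

/* Consumer side (interrupt handler): copy the data out, then release the slot. */
bool queue_get(ChunkQueue *q, int *c) {
    if (queue_empty(q)) return false;
    *c = q->chunk[q->tail];
    q->tail = (uint8_t)((q->tail + 1) % QUEUE_SLOTS);
    return true;
}
```

The indices are single bytes so that each update is one store, which an interrupt cannot split on an 8088. With one producer and one consumer this arrangement needs no CLI/STI around the queue itself, at the cost of one unused slot.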

P.S. are you the same Jim Leonard I remember from the IBasic bbs/
Einstein club?
Jim Leonard
2007-07-05 17:12:12 UTC
Post by HubbleBubble
What kind of signaling are you using to determine when your buffer is
empty and full? Are you sure you are not getting a classic producer/
consumer semaphore contention where it is possible for the buffer to
think it's both full and empty at the same time? When you are dealing
That's an interesting thought, but the flag is a binary (ie. set or
unset) variable and is only set by the main program code. So I don't
see how that condition (both set and unset) could be met.
Post by HubbleBubble
P.S. are you the same Jim Leonard I remember from the IBasic bbs/
Einstein club?
It sounds vaguely familiar... where/when was it? Was the BBS and/or
club based in Illinois?
hartnegg
2007-07-05 15:11:22 UTC
Post by Jim Leonard
How did people debug interrupt handlers,
BIOS functions, etc. "back in the day"?
When I have the source of an interrupt procedure, I let it write 65
to B800:0000 and increment that at critical spots.
When the program freezes, the character in the upper left part of the
screen shows how far it came.
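
As a rough illustration of the marker trick, here is a C sketch that runs on any machine by using an ordinary array in place of the text screen. The array text_memory and the marker_* functions are hypothetical names; on the real PC the writes would go through a far pointer to B800:0000 (colour text mode), where each cell is a character byte followed by an attribute byte.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for 80x25 colour text memory at B800:0000, 2 bytes per cell. */
static uint8_t text_memory[80 * 25 * 2];

/* Start of the routine under test: show 'A' (65) in the top-left cell. */
void marker_reset(void) { text_memory[0] = 65; }

/* Call at each critical spot: 'A' becomes 'B', then 'C', and so on. */
void marker_step(void) { text_memory[0]++; }

/* What the frozen screen would show in its top-left corner. */
char marker_shown(void) { return (char)text_memory[0]; }
```

If the machine hangs with 'C' on screen, the second marker_step was reached and the third was not. A single byte store is cheap enough to leave inside an interrupt handler.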

Stepping through Dos or Bios functions is a pain, but with Turbo
Debugger you can set a breakpoint at the memory address that an
interrupt points to. Use SwapVectors and GetIntVec or look at 0:$84
(4 * $21, i.e. 0:132 decimal) to find the address of Int 21. Go to that
address in the debugger, I think the key-combination is Alt-F10 G, then
set the breakpoint. Next time the interrupt is called, it should be triggered.
But I don't know if this works with Int 21 because the debugger itself
also needs to call that, so it could break its own operation...
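
For reference, the arithmetic behind finding a handler's address can be written out as a small C sketch. It works on a copy of the 1 KB vector table rather than on live memory, and FarPtr, vector_offset, read_vector and linear_address are made-up names. It relies only on the real-mode layout: the vector for INT n sits at 0000:(4*n), stored as a little-endian offset word followed by a segment word.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint16_t offset;
    uint16_t segment;
} FarPtr;

/* Offset within segment 0000h of the 4-byte vector for interrupt int_no. */
uint16_t vector_offset(uint8_t int_no) {
    return (uint16_t)(int_no * 4u);
}

/* Decode one vector from a copy of the interrupt vector table. */
FarPtr read_vector(const uint8_t *ivt, uint8_t int_no) {
    uint16_t at = vector_offset(int_no);
    FarPtr p;
    p.offset  = (uint16_t)(ivt[at]     | (ivt[at + 1] << 8));
    p.segment = (uint16_t)(ivt[at + 2] | (ivt[at + 3] << 8));
    return p;
}

/* Real-mode linear address: segment * 16 + offset. */
uint32_t linear_address(FarPtr p) {
    return ((uint32_t)p.segment << 4) + p.offset;
}
```

For INT 21h this gives offset 84h into the table, and for INT 13h it gives 4Ch. The segment:offset pair read from there is the address to type into the debugger's "go to" prompt.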

Also, in your case the problem probably only occurs when your
interrupt is triggered while the DOS function is still running. That
won't happen when you are stepping through it, or if it does,
everything will crash.
Post by Jim Leonard
I tried to optimize it today by having the disk
portion read more than one audio/video chunk at a time. This
dramatically improves disk performance, but now the program locks up
after a random period of time seemingly INSIDE the INT 21,3f call to
read data from a file handle.
I guess that something you do in the interrupt disturbs DOS. For
example, if the interrupt handler also calls a DOS function, or you
forgot to save a register, or your stack overflows.

hope this helps,
Klaus
Jim Leonard
2007-07-05 21:49:30 UTC
Permalink
Post by hartnegg
Stepping through Dos or Bios function is a pain, but with Turbo
Debugger you can set a breakpoint at the memory address that an
interrupt points to. Use SwapVectors and GetIntVec or look at 0:$84
(4 * $21, for Int 21) to find the address. Go to that address in the debugger, I
think the key-combination is Alt-F10 G, then set the breakpoint. Next
time the interrupt is called, it should be triggered.
I just realized this will work for DOS interrupts, but not BIOS. You
can't set a breakpoint in ROM :-) But at least I can trace through
the DOS code.
Ed Beroset
2007-07-07 15:21:44 UTC
Permalink
Post by Jim Leonard
How can I debug this sort of thing? I've tried to break the operation
of the program when it hangs in both D86 and Turbo Debugger, but being
on the 8088 there's no hardware virtualization and the machine does
not respond to my keypress. How did people debug interrupt handlers,
BIOS functions, etc. "back in the day"?
I'm not sure I can help with your specific code problem but if the
program works when you read only one chunk of data at a time from disk,
but locks up after a while if you read multiple chunks, then probably
one of two things is happening -- some memory is being corrupted (you
mentioned a shared memory allocation piece) or an interrupt is taking
too long somewhere. If you have an oscilloscope, one nice way to debug
interrupt problems on such hardware is to modify your interrupt routine
so that it toggles a pin that you can look at with a 'scope. When I did
such stuff, the printer port was often most convenient. Set a line
"high" when you enter the interrupt and set it "low" when leaving. Then
you can look at a 'scope trace on that pin to see exactly how long the
interrupt takes. Extending that, you can have multiple pins and
multiple interrupts.

Another technique is to do something like a C "assert." The way that
works in C is that you write something like "assert (count <=
MAX_COUNT);" and that line of code will then verify that the expression
is true. If it is, then the code proceeds as normal. If it isn't, then
in the traditional C model it would execute a macro which writes the
name of the source file and the line number and calls the abort()
function. That's usually not wise to attempt within an interrupt but
you can simulate such a thing by doing something like writing to a
fixed, preallocated error log in memory, or to a hardware register.
Again you might find the printer port convenient for this, or if you
have at least an EGA compatible display on your hardware, you can do
things like change the background color of the screen by writing to the
overscan register (naturally this assumes you have a CRT and not an LCD).
If you just have one or two asserts, you can write to the keyboard's
LED register and change, say, the NUMLOCK and CAPSLOCK LEDs. Another
hardware-oriented technique, common among BIOS writers some years ago,
was to send a hex byte to I/O port 80h. A plug-in hardware card then
decoded that address and displayed the hex byte on two 7-segment LED
displays on the card. If you happen to have such a card (they were
often sold as "BIOS diagnostic cards" in the 80's and 90's) or could get
one or build one, that's a convenient method. Unlike the parallel port
address, which requires that you load the dx register with the address
of the port and then do an out dx,al instruction, the 80h address can be
used in one step as out 80h, al, saving a few bytes, a few cycles, and
most importantly, the need to preserve and restore the dx register.

So to put this together, let's say that you want to assure that the DMA
is never already in progress when IRQ 2 starts being serviced. You can
write a little code to check for that condition at the start of the
interrupt, and if it is true, change the background color of the display
to red. When the machine locks up, just check the color of the screen.
If it's the normal black color, you'll know that your "assert" didn't
fire and you need to hunt elsewhere. If it's red, you'll have found
your culprit and you can dig deeper to figure out why and what to do
about it.
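
The "fixed, preallocated error log" variant of this might look roughly like the following. It is a Python model with hypothetical names (IsrLog, DMA_BUSY_ON_ENTRY), not code from the player; the point is only that a failed check records a one-byte code and returns, without allocating, printing, or aborting.

```python
class IsrLog:
    """Fixed-size log of assertion codes, safe to fill from a handler."""

    def __init__(self, slots=16):
        self.codes = bytearray(slots)  # allocated once, up front
        self.count = 0

    def check(self, condition, code):
        """Record code if condition is false; return the condition's truth."""
        if condition:
            return True
        if self.count < len(self.codes):  # drop further codes once full
            self.codes[self.count] = code & 0xFF
            self.count += 1
        return False


DMA_BUSY_ON_ENTRY = 0x01  # hypothetical code for the DMA example above

log = IsrLog()
dma_in_progress = True    # pretend the handler found DMA still running
log.check(not dma_in_progress, DMA_BUSY_ON_ENTRY)
```

After a lock-up and reset-free inspection (or on the next clean exit), the recorded codes say which check fired first.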

Hope that helps.

Ed
Jim Leonard
2007-07-09 16:29:00 UTC
Permalink
Post by Ed Beroset
If it's the normal black color, you'll know that your "assert" didn't
fire and you need to hunt elsewhere. If it's red, you'll have found
your culprit and you can dig deeper to figure out why and what to do
about it.
Excellent advice, I hadn't thought of the parallel port strobe.

Unfortunately, I'm already doing this (changing overscan color), and
it locks up during the DOS INT 21,3F (read data from file handle)
call. It just never comes back.

I can only assume my interrupt is taking too long then -- long enough
to interrupt a longer disk transfer. So I'll have to figure out some
other method of speeding up my player, I guess.

Since I have access to old hardware, I just remembered a method that
I've always wanted to try: Connect a monochrome card + monitor to
the machine. Since the memory addresses of a monochrome card don't
overlap with CGA (or VGA for that matter), I can drive both at the
same time. I think I'll print all sorts of debug information to the
monochrome screen while the program is running, to hopefully try to
catch what's going on.
Jim Leonard
2007-07-09 18:05:16 UTC
Permalink
Post by Jim Leonard
Unfortunately, I'm already doing this (changing overscan color), and
it locks up during the DOS INT 21,3F (read data from file handle)
call. It just never comes back.
I should clarify this is NOT in the interrupt; this is the mainline
code. No doubt it is the interrupt firing in the middle of the disk
transfer that is doing this, but I'm at a loss as to how to prevent
the read from locking up the machine, since the interrupt has a fixed
amount of time it *must* take to execute.
Bob Masta
2007-07-10 11:36:19 UTC
Permalink
Post by Jim Leonard
Post by Jim Leonard
Unfortunately, I'm already doing this (changing overscan color), and
it locks up during the DOS INT 21,3F (read data from file handle)
call. It just never comes back.
I should clarify this is NOT in the interrupt; this is the mainline
code. No doubt it is the interrupt firing in the middle of the disk
transfer that is doing this, but I'm at a loss as to how to prevent
the read from locking up the machine, since the interrupt has a fixed
amount of time it *must* take to execute.
Can you reduce the time the interrupt handler needs, even
as a test that doesn't yield normal functionality, to see if that
reduces the problem?

As I recall, the interrupt handler should just acknowledge the SB IRQ
and the system IRQ, and possibly change a pointer to the new
buffer the main code just filled from disk. Are you trying to
do significantly more than that?

Best regards,

Bob Masta

D A Q A R T A
Data AcQuisition And Real-Time Analysis
www.daqarta.com
Scope, Spectrum, Spectrogram, Signal Generator
Science with your sound card!
Jim Leonard
2007-07-10 15:43:53 UTC
Permalink
Post by Bob Masta
Post by Jim Leonard
Post by Jim Leonard
Unfortunately, I'm already doing this (changing overscan color), and
it locks up during the DOS INT 21,3F (read data from file handle)
call. It just never comes back.
I should clarify this is NOT in the interrupt; this is the mainline
code. No doubt it is the interrupt firing in the middle of the disk
transfer that is doing this, but I'm at a loss as to how to prevent
the read from locking up the machine, since the interrupt has a fixed
amount of time it *must* take to execute.
Can you reduce the time the interrupt handler needs, even
as a test that doesn't yield normal functionality, to see if that
reduces the problem?
That's a good idea; I hadn't thought of that. I will do that and see
if the lock-ups still occur. I suppose I can also try testing the
"InDOS" flag (undocumented?) and NOP if DOS is active, although that
will cause significant disruption to the program (see below).
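
A toy model of that InDOS check, in Python and not 8088 code; the class and field names are hypothetical. On the real machine the flag's address is obtained once with INT 21h, AH=34h (returned in ES:BX), and the handler would still have to acknowledge the Sound Blaster and the interrupt controller before returning.

```python
class Handler:
    """Model of an IRQ handler that does nothing while DOS is busy."""

    def __init__(self):
        self.in_dos = 0    # stands in for the InDOS byte
        self.updates = 0   # audio/video updates actually performed
        self.skipped = 0   # ticks dropped because DOS was active

    def on_irq(self):
        if self.in_dos != 0:
            self.skipped += 1  # acknowledge only; touch nothing else
            return
        self.updates += 1      # normal audio/video update


h = Handler()
h.on_irq()      # DOS idle: the update runs
h.in_dos = 1    # mainline enters the INT 21h read
h.on_irq()      # tick during the read: skipped
h.on_irq()      # another tick during the read: skipped
h.in_dos = 0    # read returns
h.on_irq()      # update runs again
```

The skipped count is the disruption in question: every tick that lands inside a long read is lost.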
Post by Bob Masta
As I recall, the interrupt handler should just acknowledge the SB IRQ
and the system IRQ, and possibly change a pointer to the new
buffer the main code just filled from disk. Are you trying to
do significantly more than that?
Yes -- it's a video player, so the video data is updated during that
interrupt as well. The video update always runs in fixed time (ie. it
doesn't take longer or shorter based on input from disk) so the
current scheme -- very short disk transfers (no more than about 4K at
a time) interleaved with the audio/video update interrupt fired by the
sound blaster -- works perfectly. The problem only occurs when I
attempt better disk read speeds using larger disk transfers, such as
10K or more (ie. several chunks in one large request). This results
in the lockup after a few seconds.

One of the goals for the player was 100% perfect audio/video sync, so
that's why the video is updated in the audio interrupt. If I moved
the video update portion to the mainline disk read loop (ie.
interleaving updates with disk reads), it would undoubtedly suffer
from significant jitter, although I suppose I could offer that mode to
the user as a compatibility option.

More coding for me tonight :-)