High level - low level
Robert Prins
2021-08-08 16:42:43 UTC
Again a post that never showed up. Replies to clax86 seem to work, new postings
do not???

-------- Forwarded Message --------
Subject: High level - low level
Date: Sun, 8 Aug 2021 11:07:28 +0000
From: Robert Prins <robert(*)prino.org>
Newsgroups: comp.lang.asm.x86

After nearly six months, I've started having another look at the code in my
hitchhike statistics programs, see lift32bit.rar, password "lift" @

The programs were originally written in Turbo Pascal V3.01a, changed to use its
"inline()", byte-encoded assembler, converted to pure Pascal again for Turbo
Pascal 6.0, then changed to use inline assembler, converted to Virtual Pascal
(with a surprisingly small effort to convert the inline assembler code), and
right now they are probably well over 95% inline assembler, which in many cases
no longer bears any resemblance to the original compiler generated code.

The reason for returning to the code? One heading needed updating, and jQuery,
used in one of the generated html files, has been upgraded to V3.6.0. Obviously
I decided to have another look at the code, and while doing so, I realised that
there are three places where I do min/max compares on three contiguous longint
variables, and hey, weren't there some new instructions, "vp[max/min]sd" to do
those in parallel? A bit of restructuring, to move a fourth irrelevant longint
after the three already there, allowed me to use them, and that makes me wonder:

Would a modern C(++) compiler be able to do such a transformation? Obviously not
by moving the fields in the structure, but in this case using a (V)PMAXSD
variant to compare just two of the three, even if the code first does max/min
compares on a per field basis?
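
For concreteness, the per-field max/min compares described above could look roughly like this in C. This is only a sketch: the struct layout, the field names and the function name are made up for illustration and do not come from the actual programs. Whether a given compiler turns the loop into PMAXSD/PMINSD depends on its version and options (SSE4.1 or AVX has to be enabled), so no particular output is claimed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three longints that take part in the min/max
   compares, followed by a fourth one that does not. */
struct totals {
    int32_t v[3];
    int32_t other;
};

/* Per-field update of the running maxima and minima from one record.
   The fourth field is deliberately left alone. */
void update_min_max(struct totals *max, struct totals *min,
                    const struct totals *cur)
{
    assert(max != NULL && min != NULL && cur != NULL);

    for (int i = 0; i < 3; i++) {
        if (cur->v[i] > max->v[i])
            max->v[i] = cur->v[i];
        if (cur->v[i] < min->v[i])
            min->v[i] = cur->v[i];
    }
}
```

Feeding this to the compilers mentioned, with optimization turned up, and reading the generated assembler would be one way to answer the question for a specific compiler.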

Which of course leads to the more general question: if I were to convert the
"Pure Pascal" or PL/I versions of the program to C (I still keep copies of both,
the PL/I one also on my Google Drive as "lift.pli"), how would the generated code
(MSVC, GCC, Intel, all "32-bit + opt(max)") compare to what I've been hacking
together, given that my assembler skills might be just a bit above average?
Maybe someone could show me what a C compiler would generate for

var n : double; {Local temporary}
var c : double; {Local temporary}
var yyyy: longint;
var mm : longint;
var dd : longint;
var jdn : longint;

n:= 1.0 * jdn - 1721119.2;
c:= trunc(n / 36524.25);

if jdn < 2299161 then
  n:= n + 2
else
  n:= n + c - trunc(c / 4);

yyyy:= trunc(n / 365.25);
n := n - trunc(365.25 * yyyy) - 0.3;
mm := trunc(n / 30.6);
dd := trunc(n - 30.6 * mm + 1);

if mm > 9 then
  dec(mm, 9)
else
  inc(mm, 3);

Would these compilers be smart enough to? ;)
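
As a starting point for trying this with a C compiler, a fairly literal C translation of the Pascal fragment might look as follows. It is a sketch under assumptions: both if statements are taken to be if/else pairs, and the names jdn_to_date, struct ymd and trunc32 are invented here. As in the Pascal fragment, yyyy is not adjusted when mm > 9, so for January and February it comes out one lower than the calendar year.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct ymd {
    int32_t yyyy;
    int32_t mm;
    int32_t dd;
};

/* Truncate toward zero, like Pascal's trunc(). The cast is only safe
   while the value fits in an int32_t, which holds for day numbers. */
static int32_t trunc32(double x)
{
    return (int32_t)x;
}

/* Convert a Julian Day Number to year, month and day. */
void jdn_to_date(int32_t jdn, struct ymd *out)
{
    double n;  /* local temporary */
    double c;  /* local temporary */
    int32_t yyyy, mm, dd;

    assert(out != NULL);

    n = 1.0 * jdn - 1721119.2;
    c = trunc32(n / 36524.25);

    if (jdn < 2299161)
        n = n + 2;                      /* Julian calendar dates */
    else
        n = n + c - trunc32(c / 4);     /* Gregorian calendar dates */

    yyyy = trunc32(n / 365.25);
    n    = n - trunc32(365.25 * yyyy) - 0.3;
    mm   = trunc32(n / 30.6);
    dd   = trunc32(n - 30.6 * mm + 1);

    /* The year counted above starts in March: map months 0..9 to
       March..December and months 10..11 to January..February. */
    if (mm > 9)
        mm -= 9;
    else
        mm += 3;

    out->yyyy = yyyy;
    out->mm   = mm;
    out->dd   = dd;
}
```

Working through it by hand, day number 2459435 gives 2021-08-08, and 2299160 (the last day before the 2299161 cut-over) gives 1582-10-04.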

And how would you code this in 32-bit assembler, when you can use your grey-cell
based RI processor?

Or a Heapsort where the elements to be sorted are pointers in an array on the
heap, pointing to the sort fields, i.e. something like

while (not ready) and
(_j <= n) do
if _j < n then
_k:= succ(_j);

if (sort^[_j]^.major < sort^[_k]^.major) or
(sort^[_j]^.major = sort^[_k]^.major) and
(sort^[_j]^.minor < sort^[_k]^.minor) then

if (rra^.major < sort^[_j]^.major) or
(rra^.major = sort^[_j]^.major) and
(rra^.minor < sort^[_j]^.minor) then
sort^[_i]:= sort^[_j];
_i := _j;
_j := _j * 2;
_j:= succ(n);

where major and minor are longint fields, and again as 32-bit code.
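
For comparison purposes, a C version of such a sort could be sketched as below. This is a conventional textbook heapsort over an array of pointers, not a reconstruction of the loop quoted above. The names struct rec, rec_less, sift_down and heapsort_recs are invented for illustration, and the array is 0-based, unlike the Pascal original.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct rec {
    int32_t major;
    int32_t minor;
};

/* Order by major first, then by minor. */
static int rec_less(const struct rec *a, const struct rec *b)
{
    return a->major < b->major ||
           (a->major == b->major && a->minor < b->minor);
}

/* Sift rra down from position i in the heap sort[0..n-1]. */
static void sift_down(struct rec **sort, size_t i, size_t n,
                      struct rec *rra)
{
    size_t j = 2 * i + 1;

    while (j < n) {
        if (j + 1 < n && rec_less(sort[j], sort[j + 1]))
            j++;                        /* take the larger child */
        if (!rec_less(rra, sort[j]))
            break;                      /* rra belongs at position i */
        sort[i] = sort[j];
        i = j;
        j = 2 * i + 1;
    }
    sort[i] = rra;
}

/* Sort the pointers in ascending (major, minor) order. Only the
   pointers move, the records themselves stay where they are. */
void heapsort_recs(struct rec **sort, size_t n)
{
    assert(n == 0 || sort != NULL);

    if (n < 2)
        return;

    for (size_t i = n / 2; i-- > 0; )   /* build the heap */
        sift_down(sort, i, n, sort[i]);

    for (size_t end = n - 1; end > 0; end--) {
        struct rec *rra = sort[end];    /* element to re-insert */
        sort[end] = sort[0];            /* current maximum goes last */
        sift_down(sort, 0, end, rra);
    }
}
```

The two-field compare is kept in one small helper so that the double indirection the post mentions (array entry, then record) shows up in exactly one place in the generated code.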

Let's just say that the code generated by VP for the above is not much better
than the code originally generated by TP 3.01a...

So where do compilers stand today, if you've read this snippet from Paul Hsieh:

For example, how much faster and/or smaller would an x86/AMD64 assembler
implementation of <http://www.jhauser.us/arithmetic/SoftFloat.html> be when
written in assembler?

Robert AH Prins
The hitchhiking grandfather - https://prino.neocities.org/indez.html
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html
Frank Kotler
2021-08-08 21:31:29 UTC
Post by Robert Prins
Again a post that never showed up. Replies to clax86 seem to work, new
postings do not???
Sorry, Robert. I'm "pretty" sure posts that make it to the submission
mailbox do get posted. Nothing I can do, if not.