Discussion:
IA32 vs IA64
Bryan Parkoff
2003-11-18 18:46:10 UTC
I understand that IA32 is CISC that it has approximately 65,000
instructions out of 16,777,215 instructions are present. More new
instructions will be added each year. Old instructions such as MOV that use
8 bit and 16 bit are still useable in order to be compatible with previous
software.
It looks like that Intel plans to develop Pentium V that will have 20.0
GHz in the year 2020 AD. It is what I read article from PC Magazine. I
guess that IA32 is still useable for games, but I have no idea that games
might require more than 4GB memory for best 3D graphic.
I understand that IA64 is RISC that it has only 256 instructions. 256
instructions must use 64 bit only, but it still can use 32 bit, 16 bit, or 8
bit by using AND instruction. 64 bit, 32 bit, 16 bit and 8 bit flags should
be added beside S, O, Z, C, P, and A flags. XCE instruction can be used to
switch 64 bit, 32 bit, 16 bit or 8 bit flags. Each 256 instructions can
read/write data to 64 bit, 32 bit, 16 bit and 8 bit without necessarily
using AND instruction. It looks similar to 65816 CPU that it is very close
to RISC. I guess that IA64 can support up to (16MB) TB. It is a very huge
memory than IA32 can support up to 4 GB.
IA64 is useable for the server only and probably for workstation. I
have no idea if people are moving from IA32 to IA64 for personal use, but
IA32 is still useable for another 20 years. Please provide your opinion
what you think.
--
Bryan Parkoff
Bryan Bullard
2003-11-18 20:01:10 UTC
"Bryan Parkoff" <***@nospam.com> wrote in message news:SNtub.1439$***@twister.austin.rr.com...

i think in 20 years, chip architectures will be inconceivably more
sophisticated than they are now.

if moore's law holds up (as it has the previous 20 years) chips will cycle
around 12,000 GHz in 20 years.

-bryan
Post by Bryan Parkoff
I understand that IA32 is CISC that it has approximately 65,000
instructions out of 16,777,215 instructions are present. More new
instructions will be added each year. Old instructions such as MOV that use
8 bit and 16 bit are still useable in order to be compatible with previous
software.
It looks like that Intel plans to develop Pentium V that will have 20.0
GHz in the year 2020 AD. It is what I read article from PC Magazine. I
guess that IA32 is still useable for games, but I have no idea that games
might require more than 4GB memory for best 3D graphic.
I understand that IA64 is RISC that it has only 256 instructions. 256
instructions must use 64 bit only, but it still can use 32 bit, 16 bit, or 8
bit by using AND instruction. 64 bit, 32 bit, 16 bit and 8 bit flags should
be added beside S, O, Z, C, P, and A flags. XCE instruction can be used to
switch 64 bit, 32 bit, 16 bit or 8 bit flags. Each 256 instructions can
read/write data to 64 bit, 32 bit, 16 bit and 8 bit without necessarily
using AND instruction. It looks similar to 65816 CPU that it is very close
to RISC. I guess that IA64 can support up to (16MB) TB. It is a very huge
memory than IA32 can support up to 4 GB.
IA64 is useable for the server only and probably for workstation. I
have no idea if people are moving from IA32 to IA64 for personal use, but
IA32 is still useable for another 20 years. Please provide your opinion
what you think.
--
Bryan Parkoff
Matt Taylor
2003-11-18 23:06:52 UTC
Post by Bryan Parkoff
I understand that IA32 is CISC that it has approximately 65,000
instructions out of 16,777,215 instructions are present. More new
instructions will be added each year. Old instructions such as MOV that use
8 bit and 16 bit are still useable in order to be compatible with previous
software.
No. Counting all *forms* of all instructions (not including immediate
bytes), the total comes to around 6 million. Counting extensions made by
Cyrix and AMD, the total number of unique encodings is around 1,000. There
are far fewer instructions than that, and most of those instructions belong
to MMX and SSE. Excluding MMX and SSE, there are probably only 100-200
instructions.

Characteristics of CISC:
1. Specialized registers (esi, edi used for string pointers; ebx used in
xlat; ecx used for count; edx:eax used for multiply/divide)
2. Few registers (x86 has 8)
3. Complex and specialized instructions (bound, string instructions, xlat,
jcxz)
4. Variable-length encoding (eax forms usually shorter; many instructions
have "hardwired" operands)

To compare with SPARC:
1. General-purpose registers (%o6, %o7, %i6, %i7 used in calling convention;
%g0 is always 0)
2. Large number of registers (32 registers visible at a time; up to 32
frames of 16 registers + 8 global registers)
3. General, simple instructions (no mov, only ld & st; many "synthetic"
instructions built out of other instructions)
4. Simple, constant-length instructions (all instructions are 4 bytes,
bitfields always in the same place)

Yes, x86 is backward-compatible. Backward compatibility is absolutely
essential. Other architectures like MIPS, SPARC, and IA-64 are also
backward-compatible.
Post by Bryan Parkoff
It looks like that Intel plans to develop Pentium V that will have 20.0
GHz in the year 2020 AD. It is what I read article from PC Magazine. I
guess that IA32 is still useable for games, but I have no idea that games
might require more than 4GB memory for best 3D graphic.
AMD seems to think we need 64 bits on the desktop right now. Most desktop
systems do not need more than 4 GB of RAM, but x86-64 has certain
performance enhancements such as additional registers.

Intel has their reasons for not introducing 64-bit computing to the desktop,
but despite their company line, "It is not necessary," they are very
interested in moving to 64-bit computing. They are so interested that they
developed their own 64-bit x86 extensions (not IA-64), but Microsoft told
Intel they would not maintain 3 versions of 64-bit Windows. I have read that
Prescott (Pentium 5) will probably have 64-bit extensions to be enabled
later. We'll have to wait and see.

They are quite right that 64-bit computing is not generally useful on the
desktop. A few things like RSA benefit, but for most programs it just wastes
memory. This is why Microsoft uses a 32-bit long even on 64-bit platforms.
To use 64-bit integers, you have to use an __int64.
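
A quick way to see the data-model point on whatever machine is at hand is to ask the C runtime for its type sizes. This is only a sketch using Python's ctypes module; the helper name integer_sizes is made up for illustration. On 64-bit Windows (LLP64) "long" should report 4 bytes, while most 64-bit Unix systems (LP64) report 8. "long long", the standard counterpart of __int64, is 8 bytes on both.

```python
import ctypes


def integer_sizes():
    """Return the byte sizes of a few C types as seen by this platform's ABI."""
    return {
        "int": ctypes.sizeof(ctypes.c_int),
        "long": ctypes.sizeof(ctypes.c_long),
        "long long": ctypes.sizeof(ctypes.c_longlong),
        "pointer": ctypes.sizeof(ctypes.c_void_p),
    }


sizes = integer_sizes()
for name, size in sizes.items():
    # The values printed here depend on the OS and compiler data model.
    print(f"{name:10s} {size} bytes")
```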
Post by Bryan Parkoff
I understand that IA64 is RISC that it has only 256 instructions. 256
instructions must use 64 bit only, but it still can use 32 bit, 16 bit, or 8
bit by using AND instruction. 64 bit, 32 bit, 16 bit and 8 bit flags should
be added beside S, O, Z, C, P, and A flags. XCE instruction can be used to
switch 64 bit, 32 bit, 16 bit or 8 bit flags. Each 256 instructions can
read/write data to 64 bit, 32 bit, 16 bit and 8 bit without necessarily
using AND instruction. It looks similar to 65816 CPU that it is very close
to RISC. I guess that IA64 can support up to (16MB) TB. It is a very huge
memory than IA32 can support up to 4 GB.
I don't know how many instructions IA-64 has, but RISC does not mean 256
instructions. RISC means the CPU uses instructions that are easy to
implement in hardware. For a long time SPARC didn't even have a multiply
instruction. Nowadays transistor density is so high that it would be silly
not to include a hardware multiplier.

Because of two's complement, it is unnecessary to have 8-bit, 16-bit, and
32-bit registers. You can add and subtract and the signed overflow flag (OF)
will be set correctly. Unsigned overflow (CF) can be easily detected.
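
One possible reading of this, sketched in Python: a wide register can stand in for a narrower one if software masks the result and derives the carry and overflow conditions itself. The function name add_with_flags and the returned tuple layout are assumptions made for this illustration, not anything defined by IA-32 or IA-64.

```python
def add_with_flags(a, b, width):
    """Add two width-bit values held in wider integers.

    Returns (result, carry, overflow), where carry is the unsigned
    overflow (CF) and overflow is the signed two's complement overflow (OF).
    """
    mask = (1 << width) - 1
    sign = 1 << (width - 1)
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    # Unsigned overflow: the true sum does not fit in width bits.
    carry = full > mask
    # Signed overflow: operands share a sign and the result's sign differs.
    overflow = bool(~(a ^ b) & (a ^ result) & sign)
    return result, carry, overflow


print(add_with_flags(0x7F, 0x01, 8))  # signed overflow, no carry
print(add_with_flags(0xFF, 0x01, 8))  # carry, no signed overflow
```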
Post by Bryan Parkoff
IA64 is useable for the server only and probably for workstation. I
have no idea if people are moving from IA32 to IA64 for personal use, but
IA32 is still useable for another 20 years. Please provide your opinion
what you think.
Itanium costs $3,000 per CPU. Athlon64 costs $400 right now, and the prices
should drop significantly next quarter when yields rise. Unless the price of
Itanium is drastically cut, I will buy an Athlon64 and not an Itanium.

Despite 4 GB limits, most of the mid-range servers are Xeon-based and not
Itanium-based. Xeons also have a large pricetag, but an $800 Xeon is much
cheaper than a $3,000 Itanium. Also, Xeons will run IA-32 software faster
than a 486. Opterons have not really stolen a lot of market share from
Itanium, but I think that will change in time.

-Matt
Grumble
2003-11-19 12:08:49 UTC
Post by Matt Taylor
Itanium costs $3,000 per CPU.
Which one are you talking about? :-)

There are 5 Itanium 2 implementations available:

1.0 GHz + 1.5 MB L3
1.4 GHz + 1.5 MB L3
1.3 GHz + 3 MB L3
1.4 GHz + 4 MB L3
1.5 GHz + 6 MB L3

http://www.intel.com/itanium2/
Post by Matt Taylor
Athlon64 costs $400 right now
Apparently, a 3000+ version is available for only $278:
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_609,00.html

What's Desktop Replacement?
Post by Matt Taylor
and the prices should drop significantly next quarter when yields rise.
The same is true for Itanium, don't you think? :-)
Post by Matt Taylor
Despite 4 GB limits, most of the mid-range servers are Xeon-based
64 GB - not 4 GB - with PAE.
http://x86.ddj.com/articles/2mpages/2mpages.htm
Matt Taylor
2003-11-19 20:56:15 UTC
Post by Grumble
Post by Matt Taylor
Itanium costs $3,000 per CPU.
Which one are you talking about? :-)
1.0 GHz + 1.5 MB L3
1.4 GHz + 1.5 MB L3
1.3 GHz + 3 MB L3
1.4 GHz + 4 MB L3
1.5 GHz + 6 MB L3
http://www.intel.com/itanium2/
http://www.hp.com/workstations/itanium/zx2000/

$3,000 is a rough estimate. Several years ago the cost per chip for the
Merced core was about $2,800. The price of the above workstation is a little
over $3,000. I would expect that quote to assume a 1 GHz or 900 MHz Itanium,
and the cost to HP is probably $1,000-$1,500 for the CPU. A 1 GHz Itanium is
easily beaten by a 2.2 GHz AthlonFX ($750ish).

The point is that the cost of an Itanium is exorbitant compared to the cost
of an Athlon64 or even a Pentium 4.
Post by Grumble
Post by Matt Taylor
Athlon64 costs $400 right now
http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_609,00.html
Post by Grumble
What's Desktop Replacement?
$278 per chip if you buy a tray of 1,000. The price cut was recent, and
vendors still want about $400 for the chip.

"Desktop Replacement" is supposed to be their high-end chips. The term
"desktop replacement" represents the idea of replacing all the functions of
a desktop with a laptop. I disagree because I would rather a laptop be
mobile than simply portable. High-end laptops burn through their batteries
extremely quickly (some less than 2 hours) and tend to be heavier. A
Centrino laptop can last as long as 5 hours and is much lighter. Some
Crusoe-based laptops have capacity for 4 batteries. They'll last up to 20
hours without being plugged in.

I find that I often RDP into my desktop to do serious work. The laptop is
simply a portable terminal for me. Some of the work I have done over the
past 2 months simply would not have been possible on a laptop.
Post by Grumble
Post by Matt Taylor
and the prices should drop significantly next quarter when yields rise.
The same is true for Itanium, don't you think? :-)
It hasn't fallen into the PC price range over the past 4 or 5 years, so I
would guess no. Intel marks up the Itanium because it is a server chip and
people who want extreme compute power are willing to pay those prices.
Itanium runs circles around the Xeon both in FP performance and in I/O
bandwidth. That HP machine I linked to has 6.4 GB/sec of bandwidth for a
single Itanium. The high-end 4-way Xeon systems can get up to 9 GB/sec or so
of bandwidth.
Post by Grumble
Post by Matt Taylor
Despite 4 GB limits, most of the mid-range servers are Xeon-based
64 GB - not 4 GB - with PAE.
http://x86.ddj.com/articles/2mpages/2mpages.htm
4 GB per process is the critical limit of x86 and all 32-bit machines. On
both Windows and Linux, the OS uses 1-2 GB of that address space. That PAE
(and PSE36) can go beyond 4 GB is almost irrelevant.

Tim Sweeney made some very interesting comments which I wholeheartedly agree
with on the use of PSE36 and PAE to access more than 4 GB (well, 2 GB) of
memory. AMD reprints it here:
http://www.amd.com/us-en/Weblets/0,,7832_8366_7823_8718^8320,00.html

-Matt
Grumble
2003-11-20 09:31:22 UTC
Post by Matt Taylor
Post by Grumble
Post by Matt Taylor
Itanium costs $3,000 per CPU.
Which one are you talking about? :-)
1.0 GHz + 1.5 MB L3
1.4 GHz + 1.5 MB L3
1.3 GHz + 3 MB L3
1.4 GHz + 4 MB L3
1.5 GHz + 6 MB L3
http://www.intel.com/itanium2/
http://www.hp.com/workstations/itanium/zx2000/
They don't sell bare CPUs, they sell complete systems.

You are comparing the price of a complete system (which comes with
support) with the price of a bare CPU.

However, I will not argue that Itanium 2 chips are cheap! :-)
Post by Matt Taylor
The point is that the cost of an Itanium is exorbitant compared to the cost
of an Athlon64 or even a Pentium 4.
I'm not so sure. Have you seen the price of the Extremely Expensive
Pentium 4 with 2 MB L3 cache? :-)
Matt Taylor
2003-11-20 21:16:44 UTC
Post by Grumble
Post by Matt Taylor
Post by Grumble
Post by Matt Taylor
Itanium costs $3,000 per CPU.
Which one are you talking about? :-)
1.0 GHz + 1.5 MB L3
1.4 GHz + 1.5 MB L3
1.3 GHz + 3 MB L3
1.4 GHz + 4 MB L3
1.5 GHz + 6 MB L3
http://www.intel.com/itanium2/
http://www.hp.com/workstations/itanium/zx2000/
They don't sell bare CPUs, they sell complete systems.
You are comparing the price of a complete system (which comes with
support) with the price of a bare CPU.
However, I will not argue that Itanium 2 chips are cheap! :-)
Granted, but I'm not trying to pin an absolute figure on Itanium, either. I
guessed $1,000-$1,500. I'm not sure where wholesale Itanium 2 prices would
be listed or I would simply have looked it up.

Original Merced was pretty steep, but it seems Intel has dramatically cut
the prices on Itanium 2 to get it into the market. Merced wasn't very well
accepted because it cost so much.
Post by Grumble
Post by Matt Taylor
The point is that the cost of an Itanium is exorbitant compared to the cost
of an Athlon64 or even a Pentium 4.
I'm not so sure. Have you seen the price of the Extremely Expensive
Pentium 4 with 2 MB L3 cache? :-)
I have. It's something like $900, isn't it? It's a bit ridiculous. However,
comparatively priced Itanium 2s will be beaten hands-down by comparatively
priced P4s and Athlons. The 1 GHz Itanium 2 is creamed by the 1.8 GHz
Opteron, and the 1.8 GHz Opteron is sub-$1,000. There is now a 2.2 GHz
Opteron for $913.

I should have pointed out the TPC scores before. Opteron will beat Itanium 2
by as much as a factor of 4 in performance/price. (Figures below are
workstation prices, not actual CPU prices.)

http://www.amd.com/us-en/Processors/ProductInformation/0,,30_118_8796_8800,00.html

They used to have quantitative price/performance figures, but it's still
pretty obvious which is cheaper. Opteron is priced alongside Xeon, and it
gives Itanium-like performance.

-Matt
Nudge
2003-11-18 23:10:17 UTC
Post by Bryan Parkoff
I understand that IA32 is CISC
Modern x86 processors have a RISC micro-architecture, though.
Post by Bryan Parkoff
that it has approximately 65,000 instructions
Where did you pull that number from?
Post by Bryan Parkoff
out of 16,777,215 instructions are present.
Where did you pull *that* number from?
Post by Bryan Parkoff
It looks like that Intel plans to develop Pentium V that will
have 20.0 GHz in the year 2020 AD. It is what I read article
from PC Magazine.
And everything they say is true?
Post by Bryan Parkoff
I understand that IA64 is RISC that it has only 256 instructions.
For the sake of argument, let's say IA-64 defines 256 instructions.
Would you argue that 256 instructions is a *reduced* instruction
set? Would you argue that an instruction set which defines a
population count (popcnt) instruction is a reduced instruction set?
Post by Bryan Parkoff
XCE instruction can be used to switch 64 bit, 32 bit, 16 bit or 8
bit flags.
I've never seen an XCE instruction in IA-32 or IA-64.
Post by Bryan Parkoff
I guess that IA64 can support up to (16MB) TB. It is a very huge
memory than IA32 can support up to 4 GB.
And... what is your point?

Bryan,

I don't know if this is a language problem, but you sound like a
raving lunatic most of the time. Perhaps you just need to put a
little bit more work into each post?

Nudge
Bryan Parkoff
2003-11-19 12:08:03 UTC
Nudge,
Post by Nudge
Post by Bryan Parkoff
that it has approximately 65,000 instructions
Where did you pull that number from?
Yes, it is approximately 65,000 instructions that x86 CISC has. I did
check Intel Microprocessors textbook. It looks like that MOV has a list of
general register including memory address. It is impossible to list MOV
because it is too many instructions.
Post by Nudge
Post by Bryan Parkoff
out of 16,777,215 instructions are present.
Where did you pull *that* number from?
Intel uses "0F" before instructions like Jcc. It uses 16 bit
instruction to support short and long branch. I state that it is out of
16,777,215 instructions. If 16 bit instructions are all taken, 24 bit
instructions will be used. It looks like xx xx xx before general register
and memory address. I can't remember what I read HLA manual a long time
ago. I will review and post later.
Post by Nudge
Post by Bryan Parkoff
It looks like that Intel plans to develop Pentium V that will
have 20.0 GHz in the year 2020 AD. It is what I read article
from PC Magazine.
And everything they say is true?
I meant that Intel has plans to develop Pentium V that will reach
20.0GHz in the year 2020AD. It is their hope, but it does not exist yet.
We will wait another 17 years to see for ourselves.
Post by Nudge
Post by Bryan Parkoff
I understand that IA64 is RISC that it has only 256 instructions.
For the sake of argument, let's say IA-64 defines 256 instructions.
Would you argue that 256 instructions is a *reduced* instruction
set? Would you argue that an instruction set which defines a
population count (popcnt) instruction is a reduced instruction set?
I don't get involved in an argument, but I simply say that RISC is a
very simple. It is not necessary to have complex so RISC uses simple
instructions that has maximum of 256 instructions or 8 bit instruction. I
don't know IA64 very well, but Matt Taylor claims to be 100-200 instruction
total for CISC. If it is the case, CISC only use 8 bit instruction that is
maximum of 256 instructions. First 8 bit instruction is defined. Second
hex code is defined after 8 bit instructions for general register and memory
address. Third, Fourth, etc hex code is for our data.
Post by Nudge
Post by Bryan Parkoff
XCE instruction can be used to switch 64 bit, 32 bit, 16 bit or 8
bit flags.
I've never seen an XCE instruction in IA-32 or IA-64.
Do not say that you have never seen XCE instruction in IA-32. I have
never said that way. I state that 65816 has XCE instruction by switching 8
bit, 16 bit and 32 bit (32 bit does not exist like 65832 stopped
developing). I wish that IA64 will use XCE, but it is the way how IA64 is
designed.
I suspect that IA64 will not be backward compatible to 8 bit, 16 bit,
and 32 bit so it must remain 64 bit. You still can put 8 bit, 16 bit, or 32
bit into 64 bit register before AND instruction is used to mask 8 bit, 16
bit, or 32 bit.
Post by Nudge
Post by Bryan Parkoff
I guess that IA64 can support up to (16MB) TB. It is a very huge
memory than IA32 can support up to 4 GB.
And... what is your point?
My point is that IA64 has 64 bit memory address. It starts at
00000000:00000000 and ends at ffffffff:ffffffff. IA32 has only 32 bit and
36 bit memory address that only reaches up to 4GB or 64GB. It starts at
0000:0000 or 0:0000:0000. It ends at ffff:ffff or f:ffff:ffff. I guess
that IA32 has special memory mapping that can map 4GB into 64GB linear
address, but it still limits up to 4GB.
It would be nice if IA64 can manage up to 16MB TB for huge database and
communication in the server only.
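
The address-space sizes mentioned here can be checked with a few lines of arithmetic. This is a sketch only; the helper name address_space_bytes is invented for the example, and it counts what a given number of address bits could reach in principle, not what any particular chip wires up (real implementations typically expose fewer physical address bits than the architecture allows).

```python
GIB = 2**30  # bytes in one gibibyte
TIB = 2**40  # bytes in one tebibyte


def address_space_bytes(bits):
    """Number of distinct byte addresses reachable with the given address width."""
    return 2**bits


print(address_space_bytes(32) // GIB, "GB with 32-bit addresses")
print(address_space_bytes(36) // GIB, "GB with 36-bit addresses (PAE/PSE36)")
print(address_space_bytes(64) // TIB, "TB with 64-bit addresses")
```

The last figure is 16,777,216 TB, which is presumably what "(16MB) TB" is meant to convey: roughly 16 million terabytes.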
Post by Nudge
Bryan,
I don't know if this is a language problem, but you sound like a
raving lunatic most of the time. Perhaps you just need to put a
little bit more work into each post?
I am sorry that I don't write English well. I am sure that Matt Taylor
understands most of my posts. It is only my second language.

Bryan Parkoff
Matt Taylor
2003-11-19 23:23:42 UTC
Post by Bryan Parkoff
Nudge,
Post by Nudge
Post by Bryan Parkoff
that it has approximately 65,000 instructions
Where did you pull that number from?
Yes, it is approximately 65,000 instructions that x86 CISC has. I did
check Intel Microprocessors textbook. It looks like that MOV has a list of
general register including memory address. It is impossible to list MOV
because it is too many instructions.
The mov instruction has quite a few forms. Offhand: 88-8B, 8C, 8E, A0-A3,
B0-B7, B8-BF, C6, C7, 0F 20-0F 23, 0F 24, 0F 26, and possibly others, but I
probably got them all. You don't count 88 C0 (mov al, al) separately from 88
C9 (mov cl, cl). They are the same opcode with different operands. C6 and 8C
are different opcodes for the same mnemonic (mov).
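
To make the "same opcode, different operands" point concrete, here is a small sketch that splits the ModR/M byte following opcode 88 (MOV r/m8, r8) into its mod, reg, and r/m fields. The function names are made up for this example, and only the register-to-register form (mod = 3) is handled.

```python
# 8-bit register names in their ModR/M encoding order (no REX prefix).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]


def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bit fields."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111
    rm = byte & 0b111
    return mod, reg, rm


def describe_mov_88(modrm):
    """Render 'mov r/m8, r8' (opcode 88) for the register-direct form only."""
    mod, reg, rm = split_modrm(modrm)
    if mod != 0b11:
        raise ValueError("memory operands are not handled in this sketch")
    # For opcode 88 the r/m field is the destination and reg is the source.
    return f"mov {REG8[rm]}, {REG8[reg]}"


for modrm in (0xC0, 0xC9, 0xC8):
    print(f"88 {modrm:02X}  {describe_mov_88(modrm)}")
```

All three byte pairs share opcode 88; only the operand fields in the second byte change.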

The x86 architecture has 561 mnemonics (including most non-Intel/non-AMD
extensions) and roughly 1,000-1,200 different opcodes. The total number of
encodings (not including immediates) is less than 4.9 million. I gave the
figure of 6.1 million before, but I had forgotten that this includes a lot
of duplicates. The 4.9 million figure is an upper bound. It also has
duplicates, but it has fewer.

Most RISC machines have large instruction sizes (often 4-bytes; IA-64 is
8-bytes and 24-bytes per molecule). However, the number of opcodes is small,
and the instruction set is very orthogonal. That is, instructions can use
pretty much any register to do work.
Post by Bryan Parkoff
Post by Nudge
Post by Bryan Parkoff
out of 16,777,215 instructions are present.
Where did you pull *that* number from?
Intel uses "0F" before instructions like Jcc. It uses 16 bit
instruction to support short and long branch. I state that it is out of
16,777,215 instructions. If 16 bit instructions are all taken, 24 bit
instructions will be used. It looks like xx xx xx before general register
and memory address. I can't remember what I read HLA manual a long time
ago. I will review and post later.
0F is a sort of prefix. It doubles the number of possible opcodes in the ISA
to 512. (3DNow encodes 0F 0F, so 768 possible if you count that.) Quite a
few instructions are encoded in the reg field of the ModR/M byte, so there
are more possibilities. An upper bound estimate is 768*2^3 = 6,144 maximum
opcodes. The actual number is much smaller because the reg field is not
frequently used to encode opcodes. Also, the 3DNow map is largely empty.
Only 20-30 of the 256 possible 3DNow opcodes are actually used.

<snip>
Post by Bryan Parkoff
Post by Nudge
Post by Bryan Parkoff
I understand that IA64 is RISC that it has only 256 instructions.
For the sake of argument, let's say IA-64 defines 256 instructions.
Would you argue that 256 instructions is a *reduced* instruction
set? Would you argue that an instruction set which defines a
population count (popcnt) instruction is a reduced instruction set?
I don't get involved in an argument, but I simply say that RISC is a
very simple. It is not necessary to have complex so RISC uses simple
instructions that has maximum of 256 instructions or 8 bit instruction. I
don't know IA64 very well, but Matt Taylor claims to be 100-200 instruction
total for CISC. If it is the case, CISC only use 8 bit instruction that is
maximum of 256 instructions. First 8 bit instruction is defined. Second
hex code is defined after 8 bit instructions for general register and memory
address. Third, Fourth, etc hex code is for our data.
<snip>

Nudge, the popcount algorithm isn't actually that complex. Many machines
offer that. Alpha did, and I believe MIPS also does. Windows CE even has a
"common" intrinsic for it. I'd imagine the circuit minimizes quite nicely.

Bryan, I said that x86 has 100-200 different instructions (mnemonics) not
counting MMX and SSE. There are many CISC processors, and some have fewer
than 100-200 instructions.

For x86 specifically, the encodings are variable-length. Some have implicit
operands (like B0-BF). The longest x86 instruction is 15 bytes. I would
suggest picking up an Intel manual and reading through the part on ModR/M
and SIB encoding. Appendix A has the opcode tables.

-Matt
Nudge
2003-11-20 07:00:04 UTC
Post by Matt Taylor
Most RISC machines have large instruction sizes (IA-64 is 8-bytes
and 24-bytes per molecule).
In IA-64, an instruction bundle is 128 bits wide. A bundle contains
three 41-bit instructions and a 5-bit template.
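
The bundle layout can be sketched as plain bit slicing. This assumes the arrangement given in Intel's architecture manual, with the template in the low 5 bits and the three slots following in order. The function names pack_bundle and unpack_bundle are invented for the illustration.

```python
SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOT_MASK = (1 << SLOT_BITS) - 1


def unpack_bundle(bundle):
    """Split a 128-bit bundle into (template, (slot0, slot1, slot2))."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("a bundle is exactly 128 bits")
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + SLOT_BITS * i)) & SLOT_MASK
        for i in range(3)
    )
    return template, slots


def pack_bundle(template, slots):
    """Inverse of unpack_bundle: 5 + 3 * 41 = 128 bits in total."""
    bundle = template & ((1 << TEMPLATE_BITS) - 1)
    for i, slot in enumerate(slots):
        bundle |= (slot & SLOT_MASK) << (TEMPLATE_BITS + SLOT_BITS * i)
    return bundle


example = pack_bundle(0x10, (1, 2, 3))
print(unpack_bundle(example))
```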

Itanium is not really a RISC design. The hardware is very complex.
Post by Matt Taylor
Nudge, the popcount algorithm isn't actually that complex. Many
machines offer that. Alpha did, and I believe MIPS also does.
Windows CE even has a "common" intrinsic for it. I'd imagine the
circuit minimizes quite nicely.
What is the x86 equivalent then? :-)
Matt Taylor
2003-11-20 09:33:43 UTC
Post by Nudge
Post by Matt Taylor
Most RISC machines have large instruction sizes (IA-64 is 8-bytes
and 24-bytes per molecule).
In IA-64, an instruction bundle is 128 bits wide. A bundle contains
three 41-bit instructions and a 5-bit template.
Itanium is not really a RISC design. The hardware is very complex.
Strange, not sure where I got the 24-byte figure, but I recall looking at
some diagram that showed disassembly of instructions, and there were 3 *
8-bytes. Oh well...

The instruction set would be fairly simple except for all the logic to
handle speculation. I also wonder why the rotating register stack exposes
128 64-bit general registers. That seems a bit extreme.
Post by Nudge
Post by Matt Taylor
Nudge, the popcount algorithm isn't actually that complex. Many
machines offer that. Alpha did, and I believe MIPS also does.
Windows CE even has a "common" intrinsic for it. I'd imagine the
circuit minimizes quite nicely.
What is the x86 equivalent then? :-)
I'm not sure what you're driving at since x86 is the CISC architecture and
several RISC architectures have it. I am mostly ignorant when it comes to
hardware, but it seems like it would be fairly easy to build. Perhaps x86
does not have it because it is extremely special-purpose. However, when it
is useful, the performance difference is immense.

A fast popcount instruction would make bsf completely unnecessary. The bsr
instruction is not as easily emulated, but bsf/bsr implementation is pretty
crappy on most x86 CPUs anyway. The Pentium and K6 use micro-coded loops.
Athlon and Pentium 4 use a binary search, but it's still 8-10 clocks. A
2-cycle popcount would give a 4-cycle bsf. If a bit-reverse instruction were
also added, bsr could be emulated in 6-cycles. This has the added bonus of
bit reverse and popcount being useful for other applications, too.
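
A sketch of the emulation described above, in Python for readability. The idea is that isolating the lowest set bit and subtracting one leaves a run of ones whose population count is the bit index, and that bsr is bsf applied to the bit-reversed value. The function names and the fixed 32-bit width are assumptions for the example; real hardware leaves the result undefined for a zero input, which this sketch turns into an exception.

```python
MASK32 = 0xFFFFFFFF


def popcount32(x):
    """Count the set bits in the low 32 bits of x."""
    return bin(x & MASK32).count("1")


def bit_reverse32(x):
    """Reverse the order of the low 32 bits of x."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def bsf32(x):
    """Index of the lowest set bit, built from popcount."""
    x &= MASK32
    if x == 0:
        raise ValueError("bsf is undefined for zero")
    lowest = x & -x  # isolate the lowest set bit
    return popcount32(lowest - 1)  # count the ones below it


def bsr32(x):
    """Index of the highest set bit, built from bit reverse plus bsf."""
    x &= MASK32
    if x == 0:
        raise ValueError("bsr is undefined for zero")
    return 31 - bsf32(bit_reverse32(x))


print(bsf32(0b101000), bsr32(0b101000))
```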

-Matt
Tim Roberts
2003-11-21 05:28:24 UTC
Post by Matt Taylor
Post by Nudge
Post by Matt Taylor
Nudge, the popcount algorithm isn't actually that complex. Many
machines offer that. Alpha did, and I believe MIPS also does.
Windows CE even has a "common" intrinsic for it. I'd imagine the
circuit minimizes quite nicely.
What is the x86 equivalent then? :-)
I'm not sure what you're driving at since x86 is the CISC architecture and
several RISC architectures have it.
The CDC 6000, one of the earliest RISC processors (roughly 20
instructions), included a pop count. In the hardware, it was implemented
as a side effect from the floating point normalize unit.
--
- Tim Roberts, ***@probo.com
Providenza & Boekelheide, Inc.
Matt Taylor
2003-11-21 08:28:05 UTC
Post by Tim Roberts
Post by Matt Taylor
Post by Nudge
Post by Matt Taylor
Nudge, the popcount algorithm isn't actually that complex. Many
machines offer that. Alpha did, and I believe MIPS also does.
Windows CE even has a "common" intrinsic for it. I'd imagine the
circuit minimizes quite nicely.
What is the x86 equivalent then? :-)
I'm not sure what you're driving at since x86 is the CISC architecture and
several RISC architectures have it.
The CDC 6000, one of the earliest RISC processors (roughly 20
instructions), included a pop count. In the hardware, it was implemented
as a side effect from the floating point normalize unit.
Don't you mean the equivalent of bsf? I've seen FP normalize used for
bsf-like function, but I've never heard of doing a popcount this way.

-Matt

Nudge
2003-11-20 07:06:00 UTC
Post by Bryan Parkoff
I don't know IA64 very well [...]
Then please stop making loony claims, based on a partial
understanding of what Matt said, and educate yourself:

http://intel.com/design/itanium/manuals.htm
Tim Roberts
2003-11-19 22:30:08 UTC
Post by Bryan Parkoff
It looks like that Intel plans to develop Pentium V that will have 20.0
GHz in the year 2020 AD. It is what I read article from PC Magazine.
I think you misread that. Intel doesn't plan 15 years in the future, and
chip clocks are increasing much faster than that. At the current Moore's
law rate, 20GHz Pentiums should be available in 2007.
Post by Bryan Parkoff
IA64 is useable for the server only and probably for workstation. I
have no idea if people are moving from IA32 to IA64 for personal use, but
IA32 is still useable for another 20 years. Please provide your opinion
what you think.
That's probably pretty close. The 80386 came out in 1985, and 18
years later we're still using the occasional 16-bit app.
--
- Tim Roberts, ***@probo.com
Providenza & Boekelheide, Inc.
Nudge
2003-11-19 22:44:00 UTC
Post by Bryan Parkoff
I understand that IA32 is CISC that it has approximately 65,000
instructions out of 16,777,215 instructions are present.
You assume 3-byte (24-bit) opcode implies 2^24 possible encodings.
This is incorrect when variable-length encoding is used.

For example, assume I am designing an instruction set with 4
instructions I1, I2, I3, I4. After some research, I figure that, on
average

50% of instructions will be I1
30% of instructions will be I2
10% of instructions will be I3
10% of instructions will be I4

Encode I1 with 0
Encode I2 with 10
Encode I3 with 110
Encode I4 with 111

You'll notice I've used 3 bits instead of 2. But on average, an
instruction will take 1*0.5+2*0.3+3*0.2 = 1.7 bits instead of 2.

So you see, 24-bits does NOT mean 2^24 possible encodings.
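
The example can be checked mechanically. The sketch below uses the four codewords and frequencies from this post; the helper names are made up for the illustration. It confirms that no codeword is a prefix of another and computes the average length, which works out to 1.7 bits against 2 bits for a fixed-width code. The Kraft sum shows why a 3-bit maximum length does not mean 2^3 usable encodings: the short codewords use up most of the code space.

```python
from fractions import Fraction

CODE = {"I1": "0", "I2": "10", "I3": "110", "I4": "111"}
FREQ = {
    "I1": Fraction(50, 100),
    "I2": Fraction(30, 100),
    "I3": Fraction(10, 100),
    "I4": Fraction(10, 100),
}


def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    words = list(code.values())
    return not any(
        a != b and b.startswith(a) for a in words for b in words
    )


def expected_length(code, freq):
    """Average number of bits per instruction under the given frequencies."""
    return sum(freq[name] * len(word) for name, word in code.items())


def kraft_sum(code):
    """Sum of 2^-length over all codewords; 1 means the code space is full."""
    return sum(Fraction(1, 2 ** len(word)) for word in code.values())


print(is_prefix_free(CODE), float(expected_length(CODE, FREQ)), kraft_sum(CODE))
```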

And your figure of 65000 is nonsensical anyway.
a***@NOW.AT.arargh.com
2003-11-20 02:22:35 UTC
Post by Nudge
Post by Bryan Parkoff
I understand that IA32 is CISC that it has approximately 65,000
instructions out of 16,777,215 instructions are present.
You assume 3-byte (24-bit) opcode implies 2^24 possible encodings.
This is incorrect when variable-length encoding is used.
For example, assume I am designing an instruction set with 4
instructions I1, I2, I3, I4. After some research, I figure that, on
average
50% of instructions will be I1
30% of instructions will be I2
10% of instructions will be I3
10% of instructions will be I4
Encode I1 with 0
Encode I2 with 10
Encode I3 with 110
Encode I4 with 111
That's almost exactly the way the Nova instruction set works:
0 00x xxx xxx xxx xxx - JMP, JSR, ISZ, DSZ
0 xxx xxx xxx xxx xxx - LDA, STA
0 11x xxx xxx xxx xxx - I/O insts - 8 with variations
1 xxx xxx xxx xxx xxx - reg to reg instructions - 8 with variations

Of course this is from the late 60's :-)
Post by Nudge
You'll notice I've used 3 bits instead of 2. But on average, an
instruction will take 1*0.5+2*0.3+3*0.2 = 1.7 bits instead of 2.
So you see, 24-bits does NOT mean 2^24 possible encodings.
And your figure of 65000 is nonsensical anyway.
--
Arargh311 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the garbage from the reply address.