SUMMARY: how is your fp performance?

From: <K.McManus_at_greenwich.ac.uk>
Date: Fri, 30 May 1997 16:57:31 +0100 (BST)

Hello, and thanks to all who responded.

The answer lies in loop unrolling a la KAP. Nick Hill at
Rutherford Labs has supplied me with a KAP version of the
SAXPY that achieves a fabulous 960 MFLOPS. I have left this
code for you all to try at...
http://www.gre.ac.uk/~k.mcmanus/saxpy.kap.f
and the original
http://www.gre.ac.uk/~k.mcmanus/saxpy.f
is still there for comparison.
A point of interest is that the KAP version doubles the FLOP
rate on Sun machines.
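
For anyone unfamiliar with the transform, here is a minimal sketch of loop unrolling applied to SAXPY. It is written in C rather than Fortran and is not the KAP output linked above; the function names and the unroll factor of 4 are assumptions chosen purely for illustration, and whatever KAP actually generates may differ.

```c
#include <stddef.h>

/* Reference SAXPY: y[i] = a * x[i] + y[i], one element per iteration. */
void saxpy_plain(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Same computation unrolled by 4 (factor assumed for illustration).
 * Each pass through the main loop carries four independent
 * multiply-adds, which gives the scheduler more work to overlap
 * and amortises the loop-control overhead. */
void saxpy_unrolled(size_t n, float a, const float *x, float *y)
{
    size_t i = 0;
    size_t limit = n - (n % 4);   /* largest multiple of 4 not exceeding n */

    for (; i < limit; i += 4) {
        y[i]     = a * x[i]     + y[i];
        y[i + 1] = a * x[i + 1] + y[i + 1];
        y[i + 2] = a * x[i + 2] + y[i + 2];
        y[i + 3] = a * x[i + 3] + y[i + 3];
    }

    /* Clean-up loop for the 0 to 3 elements left over. */
    for (; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Both routines should give the same result for any n; only the instruction schedule presented to the processor changes.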

This raises some intriguing questions.....

1 How come the FLOP rate is more than twice the clock rate
        of 466MHz??

2 Why can the compiler not manage this elementary transform??

3 Is this a conspiracy to raise royalties for Kuck??

4 Has anybody compared this rather poor compiler performance
        against the impressive SGI v6 compiler??

5 Why did DEC not tell me, before I bought half a million
        bucks of kit, that without KAP it would run like a
        three-legged dog??

Answers on an email please to

k.mcmanus_at_gre.ac.uk - http://www.gre.ac.uk/~k.mcmanus
-------------------------------------------------------------
Dr Kevin McManus ||
School of Computing & Math Science ||
The University of Greenwich ||
Wellington St. Woolwich ||Tel +44 (0)181 331 8719
London SE18 6PF UK ||Fax +44 (0)181 331 8665
Received on Fri May 30 1997 - 18:09:24 NZST
