Processor choice.

FlexPDE User's Forum » User Postings » Processor choice.


Archie Campbell (amc1)
Member
Username: amc1

Post Number: 5
Registered: 03-2008
Posted on Wednesday, September 03, 2008 - 02:11 pm:   

Dear Nelsons
I am getting a new computer and would like to know to what extent FlexPDE can make use of quad-core processors. Specifically, a Core 2 Duo with 6 MB cache at 3.16 GHz would cost the same as a Core 2 Quad with 2x4 MB cache at 2.4 GHz. Which is likely to be faster? If I go for the more expensive 3 GHz quad, will the increase in speed be in proportion to the clock speed?
thanks
Archie Campbell

Robert G. Nelson (rgnelson)
Moderator
Username: rgnelson

Post Number: 1165
Registered: 06-2003
Posted on Wednesday, September 03, 2008 - 03:20 pm:   

Predicting performance is difficult, because many factors affect the speed of a run, not least the characteristics of the problem being solved.

As a rule of thumb, small problems will be limited by processor clock rate, and large problems by memory transfer speed.

FlexPDE version 5 supports only two processors, and then only in the conjugate-gradient solution phase. FlexPDE version 6, which should be released before the end of the year, is more fully multi-threaded, and will support four processors.

Here are some numbers for our 3d_flowbox example (which favors multi-thread performance in version 5 because it spends a lot of time in the conjugate-gradient (CG) phase):

Version 5
2.2GHz dual-core Athlon, 333MHz 64-bit memory
1 thread = 9:16
2 threads = 7:42
2.4GHz quad-core Phenom, 666MHz 128-bit memory
1 thread = 8:15
2 threads = 6:09

The Phenom has made slightly better use of the second processor, presumably because the memory can keep up with the processors. The unthreaded parts of the v5 implementation prevent significant overall time improvement.
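
To put a rough number on how much of the version 5 run is threaded, the timings above can be fed into Amdahl's law. This is only a sketch: it assumes the threaded part scales perfectly across two threads (which memory limits may prevent), so the result is an effective fraction rather than a measurement. The helper names are made up for this illustration, and the timings are parsed as two base-60 fields; only their ratios matter here.

```python
def to_seconds(stamp: str) -> int:
    """Convert a 'major:minor' base-60 timing such as '9:16' to a single count."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


def amdahl_parallel_fraction(t_one: float, t_n: float, n_threads: int) -> float:
    """Solve Amdahl's law, speedup = 1 / ((1 - p) + p / n), for p."""
    speedup = t_one / t_n
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_threads)


# Version 5 timings quoted in the post: (1 thread, 2 threads).
v5_runs = {
    "Athlon 2.2GHz dual-core": ("9:16", "7:42"),
    "Phenom 2.4GHz quad-core": ("8:15", "6:09"),
}

v5_results = {}
for machine, (one_thread, two_threads) in v5_runs.items():
    t1 = to_seconds(one_thread)
    t2 = to_seconds(two_threads)
    speedup = t1 / t2
    fraction = amdahl_parallel_fraction(t1, t2, 2)
    v5_results[machine] = (speedup, fraction)
    print(f"{machine}: speedup {speedup:.2f}, effective threaded fraction {fraction:.2f}")
```

Under these assumptions the effective threaded fraction comes out at roughly a third on the Athlon and about a half on the Phenom, which is consistent with the unthreaded parts of version 5 limiting the overall gain.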

Version 6
2.4GHz quad-core Phenom, 666MHz 128-bit memory
1 thread = 8:06
2 threads = 4:14
3 threads = 3:30
4 threads = 3:22

On this machine, with an (almost) fully threaded implementation, the second processor appears to be fully utilized, and the memory can keep up with two processors. The third processor is not fully utilized because the memory bandwidth is now becoming saturated. The fourth thread buys almost nothing, presumably because the memory speed is the controlling factor.
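
The pattern described here can be read directly off the version 6 timings by computing the speedup and per-thread efficiency for each thread count. The following is a small illustrative calculation, not anything FlexPDE itself produces; the variable names are invented for the example.

```python
def timing_to_seconds(stamp: str) -> int:
    """Convert a 'major:minor' base-60 timing such as '8:06' to a single count."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


# Version 6 timings on the 2.4GHz quad-core Phenom, keyed by thread count.
v6_timings = {1: "8:06", 2: "4:14", 3: "3:30", 4: "3:22"}

baseline = timing_to_seconds(v6_timings[1])
speedups = {}
efficiencies = {}
for threads, stamp in v6_timings.items():
    elapsed = timing_to_seconds(stamp)
    speedups[threads] = baseline / elapsed                 # relative to one thread
    efficiencies[threads] = speedups[threads] / threads    # 1.0 means perfect scaling
    print(f"{threads} thread(s): speedup {speedups[threads]:.2f}, "
          f"efficiency {efficiencies[threads]:.0%}")
```

The efficiency stays near 96% with two threads, then drops to about 77% with three and about 60% with four, matching the observation that the third thread is only partly used and the fourth adds almost nothing.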

(The gains in version 6 shown by this test are not realized in very large problems, because less of the working memory can be held in cache.)

My inference from all this is that the higher clock rate might be a better bargain than the second pair of processors. But this is a guess, and it will depend on the actual memory speeds scaling in the same way as the processor clock.
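
The arithmetic behind this guess can be sketched as follows. It leans on two strong assumptions that the post itself flags as uncertain: that run time scales inversely with clock rate (memory speed included), and that the version 6 Phenom timings are a fair stand-in for the Core 2 machines in the question. Neither has been measured, so treat the output as a back-of-envelope comparison only.

```python
def mmss_to_seconds(stamp: str) -> int:
    """Convert a 'major:minor' base-60 timing such as '4:14' to a single count."""
    major, minor = stamp.split(":")
    return int(major) * 60 + int(minor)


# Clock rates from the original question.
quad_clock_ghz = 2.4
duo_clock_ghz = 3.16

# Version 6 timings measured on the 2.4GHz Phenom (used here as a stand-in).
measured_two_threads = mmss_to_seconds("4:14")
measured_four_threads = mmss_to_seconds("3:22")

# ASSUMPTION: time scales as 1 / clock rate, with memory keeping pace.
duo_estimate = measured_two_threads * quad_clock_ghz / duo_clock_ghz
quad_estimate = float(measured_four_threads)

print(f"Dual-core at {duo_clock_ghz} GHz, 2 threads: about {duo_estimate:.0f}")
print(f"Quad-core at {quad_clock_ghz} GHz, 4 threads: about {quad_estimate:.0f}")
```

With these inputs the faster dual-core comes out slightly ahead, but the margin is small enough that a different problem size or memory configuration could reverse it.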



