Discussion in 'Folding@Home - Join Team Guru3D !' started by Svein_Skogen, Aug 14, 2010.
You're going to be able to put up some serious numbers soon :thumbup:
Wow nice, I better pass you while I have the chance! Although, I don't think I can hold you off with that setup now
Been trying to optimize folding performance with the two 580s, single 480, and SMP2 client... actually drove into the office on my day off to try to optimize things and will likely be doing the same tomorrow. I'm having trouble keeping the TPF on the SMP2 low enough to be able to meet bigadv project deadlines. I had it down to 3 hrs under the deadline last evening but that is too close for comfort (I do want to play an occasional game). For now I have focused on the following settings. I now need to wait a couple of hours before I can ascertain the new TPF on the SMP2 client:
580 SLI disabled (per nVidia Control Panel)
480 dedicated to PhysX (setting in nVidia Control Panel)
SMP2 Client folding on 6 threads
SMP2 Client priority set to "low"
I've been experimenting with the "idle" priority in the SMP2 client's advanced settings and might need to use that setting in conjunction with the client using 7 threads and the exclusion of one of the GPU clients.
It seems the earlier research I had posted regarding thread count versus TPF is invalid, since the addition of GPUs incurs additional processor overhead beyond the minimum of what "should" be needed to keep the percent usage of each GPU high.
I can see where this is going now... I'm going to need to start saving for a six-core 980X. I'm hoping that prices on Core i7 processors will drop over the next couple of months with the introduction of Sandy Bridge. That is probably wishful thinking with the 980X (or 970) though.
what tpf are you getting with 6 threads jjb?
Seeing a TPF of 54 minutes with the 3 GPUs running. That only leaves about 6 hours to the deadline. The predicted credit (HFM.NET) is 57,226 points. Before, when running only 1 GPU client and the SMP2 on 7 threads, the TPF was 34 minutes and the predicted credit about 72,000 points.
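The deadline arithmetic above can be sketched in a few lines. This is only an illustration, assuming the standard 100 frames per SMP work unit (the helper names are mine, not part of any client):

```python
# Rough deadline check for an SMP2 work unit, assuming the standard
# 100 frames per WU (actual frame counts come from the client log).

def hours_remaining(tpf_minutes: float, frames_done: int, frames_total: int = 100) -> float:
    """Estimated hours of folding left at the current time-per-frame."""
    return (frames_total - frames_done) * tpf_minutes / 60.0

def deadline_margin_hours(tpf_minutes: float, frames_done: int,
                          hours_to_deadline: float, frames_total: int = 100) -> float:
    """Positive result = slack before the deadline; negative = WU will miss it."""
    return hours_to_deadline - hours_remaining(tpf_minutes, frames_done, frames_total)

# A full 100-frame WU at 54 min/frame needs 90 hours of uninterrupted
# folding, leaving only 6 hours of slack against a 4-day (96-hour) deadline:
print(deadline_margin_hours(54, 0, 96))  # 6.0
```

That 6-hour figure matches the margin quoted above, and it makes clear why any pause for gaming or the VM puts the deadline at risk at this TPF.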
Time for another test... overnight I'm going to try SMP2 on 7 threads with SMP2 priority set to "idle", and eliminate one GPU (the GTX-480). Will see what the TPF is in the morning.
The way I see it, it's about trying to improve the TPF of the SMP2 client so I can occasionally shut it down to do other things, while not missing the deadline in the end, plus maximizing point production during the time that remains. I may be able to make up for points lost due to shutdown of a GPU client by taking a lesser point hit in the SMP2 client.
What I have been neglecting to mention is that there also need to be enough CPU resources left over that the virtual machine I run for business purposes doesn't cause such a deficiency in CPU power that the GPU clients stop folding or are significantly impaired. A possible problem with setting the SMP2 client priority to "idle" is that IT will become what is throttled down, instead of the GPU clients, when the virtual machine is running.
IIRC, with a TPF of ~54 minutes, the PPD would be ~16K, but if you fold normal SMP2 WUs, my guess is that your PPD would be ~15K. The difference is only ~1K, so I would suggest that you fall back to normal WUs since they give you roughly the same PPD and you don't have to worry about the deadlines too much. My guess is that the highest PPD (without causing a headache for you) would be 3 GPUs with 1 normal SMP2 (-smp 6), and you can work easily on your system.
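The PPD estimates being traded here can be reproduced with the basic points formula that tools like HFM.NET use. A minimal sketch, assuming 100 frames per WU and ignoring the bonus (k-factor) scheme that bigadv credits actually use, so the numbers are only ballpark:

```python
# Basic (non-bonus) points-per-day estimate, assuming 100 frames per WU.
# Bigadv bonus points use a k-factor scheme that this deliberately ignores.

def ppd(credit: float, tpf_minutes: float, frames: int = 100) -> float:
    """Points per day = WU credit scaled by how many WUs fit into 24 hours."""
    minutes_per_wu = frames * tpf_minutes
    return credit * (24 * 60) / minutes_per_wu

# Numbers from this thread:
print(round(ppd(57_226, 54)))  # 15260 -- in the ballpark of the ~16K quoted
print(round(ppd(72_000, 34)))  # 30494 -- the earlier 7-thread, 1-GPU setup
```

The gap between those two results is the cost, in base points, of running the SMP2 client on 6 threads alongside three GPU clients.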
That sounds like a plan PantherX. The overnight test yielded a TPF of 56 minutes - even worse than before. I've done quite a bit of testing now and do not believe it will be possible to fold reliably on more than one GPU while completing bigadv projects within the 4-day deadline on this 4-core CPU - even if it is running at 4.1 GHz. I suspect a 6-core CPU is a requirement for bigadv when folding on more than one GPU.
For the time being I have halted all GPU clients, set the SMP2 priority back to "low", increased the SMP2 threads to 8, and added the "oneunit" flag. This should allow the current WU to complete well within the deadline. CPU usage is at 97% and the TPF is currently 44 minutes. Hmm... doesn't make sense, does it? I think something was wrong. I have rebooted the computer and will check the TPF again in a little bit. I'm now wondering if something was causing the TPF to cap out.
<Addition> TPF was up to 41min after reboot. I read about the SMP2 beta client not scaling well with even numbered SMP flags so I'm trying again on 7 threads now, with no GPU clients running. If the TPF is not on the order of 34min, as it was before the addition of the GTX-580 cards, then my only conclusion is that something about my system has changed that is impacting the TPF in a very negative way. Apart from installing the GTX-580 driver, Metro 2033 via Steam, and the Steam client itself there have been no other changes.
To be clear and ensure I understand what I should be doing to back away from bigadv work units... I will need to replace the "bigadv" flag with the "advmethods" flag. Is this correct?
You can add the -advmethods flag, which gives you access to late-stage beta WUs. If you run without the -advmethods flag, you will be assigned normal WUs.
Instead of -smp 8 (or plain -smp), have you tried -smp 7 so you can finish the current WU? The reason is that any CPU cycles "stolen" from FahCore_a3 will cause a non-linear slowdown in WU processing.
Yes (-smp 7)... just got to the office and the TPF was about 45 minutes. Something in my system has changed software/driver-wise that is no longer permitting even a half-way decent TPF on this quad-core... and this was with no GPUs running. So much for my comment regarding a 6-core CPU perhaps being necessary. Kind of weird... I was even contemplating going back to running only one GPU with the SMP2 client at -smp 7, as before, because (let's face it) the points/benefit-versus-wattage ratio is killed by a couple of high-end GPUs, but tests are showing there is no going back. If I were to blame something I might say the GTX-580 driver is responsible, but I say that without enough proof to back anything up at this point. I'm certainly not going to uninstall the GTX-580 drivers. I'll just have to wait for new drivers to come out, upgrade, and try testing again later on.
With all the stops, restarts, and experimentation there is no longer any chance of completing the current bigadv work unit so I have deleted it. I'll try running normal work units for now, unless you think there is any point advantage to folding those late-stage beta work units.
Thanks for the feedback!
hey jj i had to do a lot of research into what gpu3 was doing on my cpu. with process lasso you can set gpu3 to one core/thread, but with two gpu3s they will both go to that core/thread. with 3 gpu3s, unless you can find a way to force them to use different cores/threads, windows might do it, and it might pick a core/thread that the smp is using
now the other big thing i found was that when running two gpu3 clients it would sometimes use more than one core even though the core was not at 100% usage. confused yet lol?
well what i did was set gpu3 to core/thread 12, but like i just said up there it was also using core/thread 7, and core/thread 11 was not being 100% used. so i set all my gpu3 clients to use core/thread 7 and all the other cores/threads to smp 11. that meant smp was using 0,1,2,3,4,5,6,8,9,10,11,12 and the two gpu3 clients were using core/thread 7
so what i am saying is that running 3 gpu3 clients sounds very difficult, and on a quad core with 8 threads it will take some planning, BUT it is doable
hope this helps:nerd:
and i hope you get what i am saying
I currently have the SMP2 client folding a normal WU with -smp 6 and I'm seeing a production rate of 6870 PPD (TPF = 10min 49sec). I have the -oneunit flag set so I can try running a late-stage beta WU via the -advmethods flag tomorrow to see what PPD I can acquire with those work units. The SMP2 client with -smp 6 does seem to be the ticket at the moment since I can run my virtual machine and keep the three GPU3 clients nearly fully utilized. I've found that SLI on the two GTX-580s must be disabled and the GTX-480 must be "dedicated" to PhysX in the nVidia console in order to assure maximum usage of all three GPUs for folding. These actions should not be necessary because the "CUDA - GPUs" option in the nVidia console was meant to avoid this, as well as the previous need to connect a monitor to each GPU. Well, at least that last need is no longer required.
Regarding your comments, I think you are referencing the following utility. Is this correct?
If I read you correctly, you used Process Lasso to prioritize the usage of core/thread 7 by your two GPU3 clients after discovering that they were predominantly using core/thread 11 plus a little bit of 7 (even though 11 was not fully utilized). You did this because Windows seemed to want to use core/thread 7 for the GPU3 clients... so why not force it. You then had Process Lasso prioritize the SMP2 client (with the -smp 11 flag) to use the other 11 cores/threads.
Did I get that right? Is it Process Lasso that is "reining" in the clients to use specific cores/threads and not some command-line options (flags) on the clients?
yep you got it in one. it's just funny behaviour i've been seeing when running gpu3 x 2 clients
it was slowing my smp/bigadv down. my cpu is normally utilizing 95-98% total. i think i had to go 4ghz plus to get the gpus utilizing 99%; any lower, like 3.8, and one of the two would only be working at 88%
yep, Process Lasso, and only the normal flags like smp 11 and so on
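Under the hood, Process Lasso applies a per-process CPU affinity bitmask (bit N = logical core N). The split described above can be sketched as a mask calculation; this is just an illustration of the arithmetic (the helper name is mine, and a 12-thread CPU is assumed for the SMP side):

```python
# Compute Windows-style CPU affinity bitmasks for the split described above:
# GPU3 clients pinned to core/thread 7, the SMP client allowed on the rest
# of an assumed 12-thread CPU. Bit N of the mask corresponds to core N.

def affinity_mask(cores):
    """Combine a set of logical-core indices into one affinity bitmask."""
    mask = 0
    for c in cores:
        mask |= 1 << c
    return mask

gpu3_mask = affinity_mask({7})                    # only core/thread 7
smp_mask = affinity_mask(set(range(12)) - {7})    # every other core/thread

print(hex(gpu3_mask))  # 0x80
print(hex(smp_mask))   # 0xf7f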
Cool - thanks for the Process Lasso tips! I'll look into it next week when I have some time.
Of course, I still have another problem in the works since my SMP2 production rate has gone way down with the install of these GTX-580 cards... even with no GPU3 clients running, a fresh reboot, and all extraneous services terminated.
My SMP2 rate went from about 6700 to 9700 PPD in switching from normal to late-stage beta work units (-advmethods flag) so I am sticking with those for now. This is with -smp 6. The three GPU3 clients are running at 100% and only drop temporarily in usage when my virtual machine boots up. GPU usage returns to 100% once the VM has finished loading and has stabilized. So, I'm happy for now and the system will be folding without interruption through the weekend.
Total PPD is about 62500 with current work units.
I am still baffled by the drop in SMP2 performance as a result of merely installing GTX-580 drivers. I should not have had to take a 17000 point cut on my SMP2 performance (bigadv) without any GPU3 clients running. It will be interesting to see if the new drivers that nVidia is said to be releasing in early January will improve the situation. I currently have two separate sets of drivers installed... 260.xx for the GTX-480 and 263.xx for the GTX-580s. I believe the upcoming driver release will support both cards so perhaps things will get better.
The only other reason I can think of for the slowdown, crazy or not, is something related to populating two more PCIe slots.
Currently running with the case's side cover off so CPU core temperatures top out at 74 C rather than 80 C. The GPU cooling loop is heating things up inside the case. I think I need to add a fan or two to the side panel for increased air flow, something the 800D case is not known for.
LOL - If I asked you to guess what is causing the most noise in this system you would likely guess incorrectly. It's the BFG EX1200 PSU. I'm drawing about 950 watts from that PSU at the moment and it only takes a few minutes for its fan to whir up high.
here is the new/leaked driver, 263.14
as far as folding goes, even pcie x4 won't make a difference, so i don't think it would be that. but remember bigadv should be run on an 8 core/thread cpu; with an overclock you can use smp 7. smp 6 might just be too little horsepower at whatever clock speed you give it
sounds like you're pumping out some ppd now :banana:
Wow... thanks for the find Iancook! I've downloaded the driver and it does indeed unify all nVidia cards, including the 580 and 480. I'll rip out the old drivers on Monday, install this new one, and then monitor the SMP2 TPF for improvements. If the TPF improves then I'll try pulling down a bigadv work unit again.
I'm currently seeing a TPF of 4min 17sec on P6077.
Strange, on my setup (-smp 7) with 1 GTX 260/216 for P6077:
Min. Time / Frame : 00:03:21 - 15,216.22 PPD
Avg. Time / Frame : 00:03:26 - 14,665.61 PPD
If you have some free time, I suggest that you look a little deeper into what is causing this slow-down.
480 dedicated to PhysX
that's so much of an overkill huh?
Definitely, but I decided to leave it in the system for folding.
That is what I mean. It's as though the install of the two 580s trimmed my TPF values on the SMP2 client - even with the GPU3 clients taken out of the picture and shut down. This unexpected drop in TPF resulted in my having to exclude bigadv projects for the time being.
I noticed that even though the 480 and 580 use different official drivers, it appears as though all three cards are using the 580 drivers at the moment. Perhaps this might be a cause but it still doesn't sit right with me blaming video drivers for slowing down something that isn't related to video in any way (the SMP2 client). The install of the two 580 cards, their driver, the install of the Steam client, and the install of Metro 2033 are the only changes I made to the system that could have caused the slowdown in the TPF.
What I'll try doing tomorrow is uninstalling both sets of drivers and then re-installing the driver for the 480 only. I'll then use the 480 only (leaving the 580s disabled) and check the SMP2 TPF again to see if it has increased. If so, that will be an indicator that the 580 driver, or perhaps its use by the 480 when the 580 driver is installed over the 480 driver, is the cause.
I'll report back when I have some results.