Hard reboots under load with SLI GTX260 - PSU?

Discussion in 'General Hardware' started by McBaresark, Jun 20, 2009.

  1. McBaresark

    McBaresark New Member

    Messages:
    6
    Likes Received:
    0
    GPU:
    2 x Galaxy GTX260+ 896MB
    Hello All,

    I need some advice from any PSU gurus out there. I've just put together a new system as follows:

    - DFI LanParty UT X58 T3eH8 (BIOS: 8th May 2009)
    - i7 920 stepping D0 @ 2.66GHz
    - 6GB Corsair CM3X2G1600C7 RAM (3 x 2GB, 533MHz 8-8-8-19)
    - Cooler Master Real Power M1000 PSU
    - 2 x Galaxy GTX260+ in SLI (625/1000/1350 clocks) - SLI bridge connected
    - WD Caviar Black 1TB SATA HD
    - LG BluRay/HDDVD SATA burner
    - Creative X-Fi XtremeGamer PCI
    - Cooler Master HAF 632 case (4 fans)
    - Vista x64 SP2

    No problems at idle (e.g. sitting at the desktop, browsing via IE, etc.), but whenever I run a heavy 3D game like Far Cry 2 or Crysis the game will run for anywhere from a few seconds to a couple of minutes, then the PC will hard reboot without warning. No BSOD, nothing in the event log, just a hard reboot.

    All components are running at stock speeds and voltages - no overclocking of anything. If I disable SLI in the Nvidia control panel and run the games with only one card it seems fine. I've also run MemTest86+ for 8 hours (11 full passes) with no errors whatsoever, so I don't think it's my RAM. Also, LinX completes a full test in about 9 mins with no errors, so I suspect the CPU is also not the culprit.

    If I try an older, less demanding game like Red Orchestra (Unreal 3 engine) it lasts longer before rebooting, maybe 10-15 mins. Also, if I underclock the GPUs it lasts longer, depending on the degree of underclock. If I leave it idling at the desktop the system will stay up indefinitely (tested for 4 days so far) - so it definitely appears GPU-related.

    I suspect the issue may be my PSU not supplying enough current... does anyone have any advice on whether the CM M1000 is suitable for GTX260s in SLI?
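    For what it's worth, this is the rough power budget I've been using to convince myself the unit should cope. The per-component wattages are ballpark assumptions (TDPs and figures I've seen in reviews), not measurements:

    Code:
    # Rough DC power budget vs the CM Real Power M1000's combined 12V capacity.
    # All wattages below are ballpark assumptions, not measurements.
    components = {
        "i7 920 (130W TDP, stock)": 130,
        "GTX 260+ #1 (approx. board power)": 180,
        "GTX 260+ #2 (approx. board power)": 180,
        "X58 motherboard + 6GB RAM": 60,
        "HDD + BluRay drive + sound card + 4 fans": 50,
    }

    total_watts = sum(components.values())
    psu_12v_watts = 80 * 12  # 80A combined peak on the 12V rails

    print(f"Estimated load: ~{total_watts}W")
    print(f"Combined 12V capacity: {psu_12v_watts}W")
    print(f"Headroom: ~{psu_12v_watts - total_watts}W")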

    Thanks!
     
  2. McBaresark

    McBaresark New Member

    Messages:
    6
    Likes Received:
    0
    GPU:
    2 x Galaxy GTX260+ 896MB
    This is my PSU. According to that review the 6 x 12V rails are connected as follows:

    12V1 - 18A - half of the EPS12V connector plus motherboard (CPU connectors maybe?)
    12V2 - 18A - half of the EPS12V connector plus 4-pin ATX12V connector
    12V3 - 28A - first blue PCIe modular connector (6 or 8 pin)
    12V4 - 28A - second blue PCIe modular connector (6 or 8 pin)
    12V5 - 18A - molex/SATA/FDD connectors
    12V6 - 18A - both green PCIe modular connectors (6 pin only)

    I've got the motherboard/CPU on 12V1 & 12V2 via the EPS12V and 8-pin CPU VRM power connectors. The DVD drive, the HDD and the 2 extra FDD-type 12V/5V connectors next to the PCIe slots (for additional power to the mobo) are on 12V5. Each GPU is plugged into one blue connector (12V3 and 12V4 respectively) and one green connector (12V6, shared by both cards).

    The Galaxy doco is very poor - it mentions that one GTX260 requires "min 36A on 12V rail" but doesn't list the current required for 2 in SLI config. Is it double - does it require 72A on the 12V rails? The PSU can output 80A total peak current on 12V but I suspect I may be overdrawing the specific rails the GPUs are plugged into.
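    To put the rail question in watts rather than amps, here's my rough sanity check. Since Galaxy don't document how each card splits its draw between the two 6-pin plugs, I've just used the 75W-per-6-pin spec limit as a worst case (an assumption):

    Code:
    # Per-rail capacity of the CM M1000 (GPU-related rails only), and the
    # worst-case load on the shared green rail if both green 6-pins pulled the
    # full 75W the PCIe spec allows. Actual per-connector draw is unknown.
    rails_amps = {"12V3": 28, "12V4": 28, "12V6": 18}
    six_pin_spec_watts = 75

    for rail, amps in rails_amps.items():
        print(f"{rail}: {amps}A -> {amps * 12}W capacity")

    # 12V3 and 12V4 each feed one card's blue 6-pin; 12V6 feeds the green
    # 6-pin on BOTH cards.
    worst_12v6 = 2 * six_pin_spec_watts
    print(f"12V6 worst case at spec: {worst_12v6}W of {rails_amps['12V6'] * 12}W")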

    I tried moving one of the 2 6-pin PCIe connectors on the 2nd card from the green 12V6 rail to a double-molex-to-6-pin PCIe adaptor on 12V5 but it didn't help.

    Any suggestions?
     
  3. bp9801

    bp9801 Ancient Guru

    Messages:
    1,598
    Likes Received:
    0
    GPU:
    BFG 8800 GTX
    Well, the 260s in SLI probably only need about 50 to 60 amps on the 12v rails; just because there are two of them doesn't mean they double the amps. They use more, yeah, but not a perfect doubling. If you're running that kind of hardware on that power supply, odds are it's either a faulty unit or not all the rails are operating at peak. Try running tests on the power supply itself, either with a tester or by removing all but the bare minimum of components. If it doesn't hold up even with the reduced load, I'd suggest RMAing it for the same unit or something better. An Enermax Galaxy EVO 1200W or the Revolution85+ 1050W should have enough amps on the 12v rails to sustain your system, especially with the SLI cards. Just because the unit is 1000W doesn't mean it can output that, and personally I'd trust Enermax over Cooler Master any day. But that's just me.

    Also, with your unit, you probably are overloading the 12v6 rail because it's having to pull double duty for both cards. It should be enough in theory, but it's probably causing a small overload that makes it unstable. I'd try for an RMA after you've tested the rails and made sure they're delivering their rated amps.
     
    Last edited: Jun 20, 2009
  4. McBaresark

    McBaresark New Member

    Messages:
    6
    Likes Received:
    0
    GPU:
    2 x Galaxy GTX260+ 896MB
    Thanks for the quick reply!

    I don't really have the equipment or the skills to test the PSU rail currents properly, and the only components I could remove to reduce the draw and still test are the sound card and case fans, which I doubt add up to much. I might take your suggestion and try to RMA the unit and get a higher-spec one. I can't find the Enermax PSUs easily here in Australia (unless they're branded as something else here) - what about this:

    Corsair HX1000W

    It's got 2 x "true" 12V rails rated at 40A each, and each shares a 500W maximum with the 5V and 3.3V rails respectively. Two of the PCIe connectors hang off each 12V rail. Only the 5V rail is rated lower than on my current CM M1000 (30A vs 40A). More info here:
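    Quick maths on those HX1000W rails, based on the figures in the reviews above:

    Code:
    # Corsair HX1000W: two 12V rails rated 40A each, with each rail sharing a
    # 500W cap with its 3.3V/5V group (per the reviews linked above).
    rail_amps = 40
    rail_watts = rail_amps * 12   # 480W per 12V rail at full amperage
    shared_cap = 500              # combined cap per side incl. 3.3V/5V

    print(f"Each 12V rail: up to {rail_watts}W, capped at {shared_cap}W combined")
    print(f"Both sides together: up to {2 * shared_cap}W, matching the 1000W rating")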

    http://www.hardocp.com/article.html?art=MTQ4NywyLCxoZW50aHVzaWFzdA==
    http://www.custompc.co.uk/labs/210684/corsair-hx1000w.html
    http://www.overclockersclub.com/reviews/corsair_hx1000w/2.htm

    Think that would do the trick with the SLI 260s?

    Thanks for your help!

    PS - I noticed that the Nvidia SLIZone site doesn't show the M1000 as certified for 2 x 260s, though it's listed for lower powered cards further down. But the Corsair HX1000 is certified...
     
    Last edited: Jun 20, 2009

  5. bp9801

    bp9801 Ancient Guru

    Messages:
    1,598
    Likes Received:
    0
    GPU:
    BFG 8800 GTX
    The Corsair HX line is generally one of the better lines out there - good clean power and consistent rails. It should keep your 260s in line no problem, though your current PSU should as well. I'm thinking there's a faulty rail or circuit somewhere in there; if you can test it out, that would be best. You could hit up some buddies and see if they have a PSU tester on hand, or try to buy one from a store. Something like this Rexus unit will show you all the relevant data for each connector and rail. You'd have to shell out a bit more cash for it, but it's a very handy tool to have around because you can test any power supply you get or have and make sure it's in good condition. I think it's a small price to pay so that nothing damages your system or shorts out an expensive component.
     
  6. McBaresark

    McBaresark New Member

    Messages:
    6
    Likes Received:
    0
    GPU:
    2 x Galaxy GTX260+ 896MB
    Will a PSU tester like that one show the current or just the voltage?
     
  7. Makalu

    Makalu Ancient Guru

    Messages:
    4,196
    Likes Received:
    2
    GPU:
    EVGA 8800 Ultra
    This can vary, but the info I have is that a GTX260 draws ~7A through one PCIe connector, ~4A through the other and ~4A through the slot. You can try blindly swapping PCIe connectors, but even if you've got the two ~7A ones on 12V6 it's not coming close to that rail's limits. 12V1 is possibly closer to its limit, depending on how DFI has implemented the 24-pin with the additional mobo molex on 12V5.
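    If you want to sanity-check it yourself, the rough sums look like this (my ~7A/~4A/~4A split is approximate and can vary by card):

    Code:
    # Approximate 12V draw split for one GTX 260 (rough figures, vary by brand):
    # one 6-pin PCIe plug, the other 6-pin plug, and the PCIe slot itself.
    per_card_amps = {"6-pin A": 7, "6-pin B": 4, "PCIe slot": 4}

    total_amps = sum(per_card_amps.values())
    print(f"Per card: ~{total_amps}A on 12V (~{total_amps * 12}W)")

    # Even if both cards' heavier 6-pins ended up on the shared 12V6 rail:
    worst_12v6 = 2 * per_card_amps["6-pin A"]
    print(f"Worst case on 12V6: ~{worst_12v6}A of its 18A rating")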

    Anyway, I don't see a problem with that PSU and GTX260 SLI... it's a great unit, built like a tank and one of the best in its class. It's also certified for 8800 Ultra SLI, which draws more current. CM may simply not have paid Nvidia to have the older model tested with the newer cards.

    Don't buy the "PSU tester"... it'll only tell you whether the PSU turns on and the voltages are in spec under a ~1A load. You can get a cheap DMM for the same price and measure the voltages under full real system loads - a real measurement, not an idiot light.

    I dunno... sounds possibly temp-related to me. Have you got the event handler enabled? It could also be an AC problem... try another outlet, or change your surge suppressor/UPS/AVR arrangement around, or whatever you have.
     
  8. McBaresark

    McBaresark New Member

    Messages:
    6
    Likes Received:
    0
    GPU:
    2 x Galaxy GTX260+ 896MB
    Hi Makalu,
    I don't think it's a temp problem: I've been running GPU-Z logging to a text file during my testing and the highest GPU reading I've seen (the 1st "GPU Temperature" value in the Sensors section) is 51 degrees C at the point of the reboot. The rest of the GPU-Z sensors (2nd "GPU Temperature" and the 2 PCB readings) are way lower. Likewise my CPU & chipset are running quite cool - the case seems to do a good job of moving air around with its 4 fans and grills all over.
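    In case it helps, this is roughly how I've been pulling the peak readings out of the GPU-Z log after each reboot. The file name and column headings depend on your GPU-Z version, so adjust them to match your own log (mine is comma-separated):

    Code:
    # Scan a GPU-Z sensor log for the peak value of each column of interest.
    # Column names vary by GPU-Z version - adjust COLUMNS to match your file.
    import csv

    LOG_FILE = "GPU-Z Sensor Log.txt"    # whatever you pointed GPU-Z's logging at
    COLUMNS = ["GPU Temperature [°C]"]   # add more sensor columns as needed

    peaks = {col: 0.0 for col in COLUMNS}
    with open(LOG_FILE, newline="", encoding="utf-8", errors="ignore") as f:
        for row in csv.DictReader(f, skipinitialspace=True):
            for col in COLUMNS:
                try:
                    peaks[col] = max(peaks[col], float(row[col]))
                except (KeyError, TypeError, ValueError):
                    pass                 # skip blank or garbled rows

    for col, value in peaks.items():
        print(f"Peak {col}: {value}")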

    I initially thought it might be mobo voltages, but I've tried raising the CPU VCORE, CPU VTT (Uncore) and VDIMM voltages as high as Intel specs allow and it makes no difference to the rebooting problem. The CPU and RAM seem solid, and the system is fine until I start stressing the GPUs - and the less I stress them, the longer it lasts before the reboot. To me that says it's either temps or current, and I've confirmed that the temps are fairly cool...

    Your figures add up to a total draw of 15A - that's a lot lower than the 36A Galaxy quotes for each card. Are they just being conservative?

    And I'm not sure what you're referring to with this:

    "Have you got the event handler enabled?"

    Could you elaborate?

    Thanks!
     
  9. Makalu

    Makalu Ancient Guru

    Messages:
    4,196
    Likes Received:
    2
    GPU:
    EVGA 8800 Ultra
    36A is a recommended PSU 12V rating for powering the complete system and includes a good deal of overhead. Card alone is ~15A as measured here:

    http://ht4u.net/reviews/2009/power_consumption_graphics/index20.php

    actually closer to 14A.
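    Putting the two figures side by side makes the difference obvious:

    Code:
    # "36A on 12V" is a whole-system PSU recommendation; the card alone is ~14A
    # per the ht4u measurement linked above.
    card_amps = 14
    recommended_psu_amps = 36

    print(f"One card: ~{card_amps}A (~{card_amps * 12}W)")
    print(f"Recommended PSU 12V: {recommended_psu_amps}A "
          f"(~{recommended_psu_amps * 12}W) - CPU, board, drives and headroom included")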

    The event handler I'm talking about is the setting to display an error and write to an error log rather than automatically rebooting when there's a system error. It's in System in Control Panel but I think it's different in Vista...which I don't have. Maybe somebody here with Vista can point you in the right direction.
     
  10. McBaresark

    McBaresark New Member

    Messages:
    6
    Likes Received:
    0
    GPU:
    2 x Galaxy GTX260+ 896MB
    Ah - under Vista this is under: System --> Advanced System Settings --> Startup And Recovery --> System Failure --> Write Event To System Log

    And yes, this was already enabled during my testing and it didn't produce anything at all - no event log entries and no minidump file. That tells me no software component is failing (e.g. this isn't the Nvidia driver or any other kernel driver crashing). It's as if the power just drops out, which causes the system to restart.

    I just tried underclocking the GPUs to the speeds shown in that German power test you linked: 575/999/1242. This certainly makes it last longer before the reboot - I got through 3 loops of the "Small Ranch" benchmark in Far Cry 2 with no crash, but once I actually try the game itself it reboots in about 5 mins, usually at a point of high graphical load (explosions, transparent flames and fast motion).

    Your point about AC supply possibly being the problem: I have the PC plugged into a powerstrip with inbuilt surge suppression, along with the following devices:

    - Samsung 19" 940BW monitor
    - NetComm Plus 4 ADSL modem/router
    - oldskool Hitachi analog amp (the kind you'd find in your father's hifi rack) plugged into a pair of old hifi speakers

    Are you thinking that maybe the whole draw on the AC/powerstrip may be too much when the GPUs fire up? There's an LED on the powerstrip that's supposed to change colour if its circuit breaker trips, and it isn't changing when the reboot happens. Also, the amp and monitor don't seem to be affected when it reboots.
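    For reference, my rough tally of what's hanging off that powerstrip versus a standard 10A/240V circuit here - the individual wattages are guesses:

    Code:
    # Rough AC draw on the powerstrip vs a typical 10A/240V Australian circuit.
    # Individual figures are guesses; the PC line assumes ~85% PSU efficiency.
    loads_watts = {
        "PC under 3D load (~600W DC / 0.85)": 700,
        "Samsung 940BW 19in monitor": 50,
        "ADSL modem/router": 15,
        "Old Hitachi amp": 60,
    }

    total = sum(loads_watts.values())
    circuit_capacity = 10 * 240   # 2400W

    print(f"Estimated draw: ~{total}W of {circuit_capacity}W available")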

    It seems to me that whatever it is, I'm running very near a threshold somewhere, and when system load pulls it over - BANG. The question is: where/what is that threshold? Any idea which of the two 6-pin connectors on the GTX260s is the 7A one and which is the 4A one?

    Cheers!

    PS - I tried switching the old amp off and retesting, but it rebooted in about the same time/place as before.
     
    Last edited: Jun 21, 2009

  11. Makalu

    Makalu Ancient Guru

    Messages:
    4,196
    Likes Received:
    2
    GPU:
    EVGA 8800 Ultra
    Naw, I don't think it's exceeding any limits on the surge suppressor, but the suppressor itself may be flaking out, so try it without the surge suppressor, and perhaps try it plugged into a wall outlet on a different house circuit if possible - though I doubt that'll change anything.

    No, I have no idea which connector is which - that's one of the things that may differ from brand to brand. The total would be the same but perhaps distributed differently, and it shouldn't be over 7A on any one connector since the spec calls for a max of 6.25A on a 6-pin PCIe.
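    Those spec limits come straight from the connector power ratings:

    Code:
    # PCIe auxiliary power limits per the spec, converted to amps at 12V.
    for name, watts in [("6-pin PCIe", 75), ("8-pin PCIe", 150)]:
        print(f"{name}: {watts}W -> {watts / 12:.2f}A")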
     
  12. souravbasak911

    souravbasak911 Guest

    Messages:
    1
    Likes Received:
    0
    GPU:
    Nvidia Geforce 9500 GT
    Hello McBaresark,
    I am having exactly the same type of problem, and it only started a few days ago... what was your solution? Please reply soon - thanks in advance.
     
