extremely slow 2D performance

  ymatioun

    Sapphire Radeon HD 6870
    My computer is experiencing extremely slow 2D performance. This problem manifests itself in very low frame rates on DVD playback, on playback of any videos, and on anything else 2D.

    Here are some numbers to support this statement: results of the 2D benchmark from PassMark.

    PassMark(TM) PerformanceTest 7.0 Evaluation Version

    Benchmark Results

    Test Name: This Computer
    Graphics 2D - Solid Vectors: 0.3
    Graphics 2D - Transparent Vectors: 0.1
    Graphics 2D - Complex Vectors: 138.6
    Graphics 2D - Fonts and Text: 16.4
    Graphics 2D - Windows Interface: 15.5
    Graphics 2D - Image Filters: 21.1
    Graphics 2D - Image Rendering: 35.7
    2D Graphics Mark: 82.8

    Test Name: generic Radeon HD 6870
    Graphics 2D - Solid Vectors: 1.71
    Graphics 2D - Transparent Vectors: 1.72
    Graphics 2D - Complex Vectors: 141.2
    Graphics 2D - Fonts and Text: 235.5
    Graphics 2D - Windows Interface: 111.0
    Graphics 2D - Image Filters: 316.5
    Graphics 2D - Image Rendering: 589.3
    2D Graphics Mark: 411.5

    System information: This Computer
    CPU Manufacturer: GenuineIntel
    Number of CPU: 1
    Cores per CPU: 6
    CPU Type: Intel Core i7 970 @ 3.20GHz
    CPU Speed: 3328.0 MHz
    Cache size: 256KB
    O/S: Windows XP Professional (64-bit)
    Total RAM: 6133.8 MB.
    Available RAM: 4079.1 MB.
    Video settings: 1920x1080x32
    Video driver:
    DESCRIPTION: AMD Radeon HD 6870
    MANUFACTURER: Advanced Micro Devices, Inc.
    BIOS: 113-E17700SA-S44
    DATE: 12-5-2011
    Most of the results are more than 10 times slower than the results from a generic machine with the same video card. Image rendering is 16 times slower!
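
To make the comparison concrete, here is a minimal sketch that divides each reference score by the matching score from this computer. The numbers are copied from the two result lists above; the variable names are only illustrative.

```python
# Scores copied from the PassMark results quoted in the post.
this_pc = {
    "Solid Vectors": 0.3,
    "Transparent Vectors": 0.1,
    "Complex Vectors": 138.6,
    "Fonts and Text": 16.4,
    "Windows Interface": 15.5,
    "Image Filters": 21.1,
    "Image Rendering": 35.7,
}
reference = {
    "Solid Vectors": 1.71,
    "Transparent Vectors": 1.72,
    "Complex Vectors": 141.2,
    "Fonts and Text": 235.5,
    "Windows Interface": 111.0,
    "Image Filters": 316.5,
    "Image Rendering": 589.3,
}

# Slowdown factor per test: how many times higher the reference score is.
ratios = {name: reference[name] / this_pc[name] for name in this_pc}

# Tests where this computer is more than 10 times slower than the reference.
slow = [name for name, ratio in ratios.items() if ratio > 10]

if __name__ == "__main__":
    for name, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
        print(f"{name:20s} {ratio:5.1f}x")
```

Complex Vectors is the one test that is essentially on par with the reference, which suggests the slowdown is tied to a particular kind of operation rather than to the whole 2D path.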

    I traced this problem to slow Blt() and BltFast() calls. It appears that each call to Blt() or BltFast() incurs a time penalty of approximately 90 microseconds; after this penalty, the operation proceeds at normal (fast) speed. As a result, moves of small blocks of memory are extremely slow, while moves of large blocks are fast.
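
The effect of a fixed per-call penalty can be illustrated with a simple cost model. This is only a sketch: the 90 microsecond overhead is the figure reported above, while the copy rate of 1000 bytes per microsecond (about 1 GB/s) is a made-up assumption, and all function names are hypothetical. The last helper shows one way the fixed overhead could be estimated from two timed blits of different sizes, assuming time grows linearly with size.

```python
OVERHEAD_US = 90.0               # per-call penalty reported in the post
BANDWIDTH_BYTES_PER_US = 1000.0  # hypothetical copy rate (about 1 GB/s), not measured


def blit_time_us(num_bytes, overhead_us=OVERHEAD_US,
                 bandwidth=BANDWIDTH_BYTES_PER_US):
    """Modelled time of one blit: fixed overhead plus size-proportional copy time."""
    return overhead_us + num_bytes / bandwidth


def effective_rate(num_bytes, overhead_us=OVERHEAD_US,
                   bandwidth=BANDWIDTH_BYTES_PER_US):
    """Bytes per microsecond actually achieved once the overhead is included."""
    return num_bytes / blit_time_us(num_bytes, overhead_us, bandwidth)


def estimate_overhead_us(size_a, time_a, size_b, time_b):
    """Extrapolate two (size, time) measurements back to a zero-byte blit."""
    slope = (time_b - time_a) / (size_b - size_a)
    return time_a - slope * size_a


if __name__ == "__main__":
    for width, height in [(16, 16), (256, 256), (1920, 1080)]:
        num_bytes = width * height * 4  # 32-bit pixels
        print(f"{width}x{height}: {blit_time_us(num_bytes):9.1f} us, "
              f"{effective_rate(num_bytes):7.1f} bytes/us")
```

Under these assumptions a 16x16 blit reaches only about 1% of the assumed copy rate, while a full 1920x1080 surface gets close to all of it, which matches the pattern of small moves being slow and large moves being fast.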

    I wrote a simple program that calls Blt() in a loop; while this program is running, CPU utilization is 100% (on one core only). I ran the AMD CodeAnalyst profiler to see where the CPU time is being spent; virtually all of it is spent in the module ati2cqag.dll (Central Memory Manager / Queue Server Module), part of the ATI video drivers. Specifically, most of the time is spent in one small function (only 52 assembler instructions). Here is that function from the drivers included in the Catalyst 12.1 package (obtained using DUMPBIN):

    0000000000036C60: 48 83 EC 28 sub rsp,28h
    0000000000036C64: 48 8B 41 08 mov rax,qword ptr [rcx+8]
    0000000000036C68: 48 89 5C 24 30 mov qword ptr [rsp+30h],rbx
    0000000000036C6D: 48 89 6C 24 38 mov qword ptr [rsp+38h],rbp
    0000000000036C72: 8B 69 48 mov ebp,dword ptr [rcx+48h]
    0000000000036C75: 48 89 74 24 40 mov qword ptr [rsp+40h],rsi
    0000000000036C7A: 48 8B B0 98 030000 mov rsi,qword ptr [rax+00000398h]
    0000000000036C81: 48 8B 06 mov rax,qword ptr [rsi]
    0000000000036C84: 48 8B D9 mov rbx,rcx
    0000000000036C87: 8B D5 mov edx,ebp
    0000000000036C89: 48 8B CE mov rcx,rsi
    0000000000036C8C: 48 89 7C 24 48 mov qword ptr [rsp+48h],rdi
    0000000000036C91: FF 90 C8 00 00 00 call qword ptr [rax+000000C8h]
    0000000000036C97: 48 85 C0 test rax,rax
    0000000000036C9A: 48 8B F8 mov rdi,rax
    0000000000036C9D: 74 2B je 0000000000036CCA
    0000000000036C9F: 48 8B 80 B8 000000 mov rax,qword ptr [rax+000000B8h]
    0000000000036CA6: 4C 8B 00 mov r8,qword ptr [rax]
    0000000000036CA9: 4C 3B 43 40 cmp r8,qword ptr [rbx+40h]
    0000000000036CAD: 7D 4B jge 0000000000036CFA
    0000000000036CAF: 48 8B 06 mov rax,qword ptr [rsi]
    0000000000036CB2: 8B D5 mov edx,ebp
    0000000000036CB4: 48 8B CE mov rcx,rsi
    0000000000036CB7: FF 50 50 call qword ptr [rax+50h]
    0000000000036CBA: 4C 8B 9F B8 000000 mov r11,qword ptr [rdi+000000B8h]
    0000000000036CC1: 49 8B 03 mov rax,qword ptr [r11]
    0000000000036CC4: 48 3B 43 40 cmp rax,qword ptr [rbx+40h]
    0000000000036CC8: 7D 30 jge 0000000000036CFA
    0000000000036CCA: 48 83 7B 20 00 cmp qword ptr [rbx+20h],0
    0000000000036CCF: 48 8D 4B 20 lea rcx,[rbx+20h]
    0000000000036CD3: 75 06 jne 0000000000036CDB
    0000000000036CD5: FF 15 2D 03 05 00 call qword ptr [00087008h]
    0000000000036CDB: 8B 43 34 mov eax,dword ptr [rbx+34h]
    0000000000036CDE: 85 C0 test eax,eax
    0000000000036CE0: 75 0F jne 0000000000036CF1
    0000000000036CE2: 38 43 38 cmp byte ptr [rbx+38h],al
    0000000000036CE5: 75 13 jne 0000000000036CFA
    0000000000036CE7: 8B 43 30 mov eax,dword ptr [rbx+30h]
    0000000000036CEA: 89 43 34 mov dword ptr [rbx+34h],eax
    0000000000036CED: 32 C0 xor al,al
    0000000000036CEF: EB 0B jmp 0000000000036CFC
    0000000000036CF1: FF C8 dec eax
    0000000000036CF3: 89 43 34 mov dword ptr [rbx+34h],eax
    0000000000036CF6: 32 C0 xor al,al
    0000000000036CF8: EB 02 jmp 0000000000036CFC
    0000000000036CFA: B0 01 mov al,1
    0000000000036CFC: 48 8B 7C 24 48 mov rdi,qword ptr [rsp+48h]
    0000000000036D01: 48 8B 74 24 40 mov rsi,qword ptr [rsp+40h]
    0000000000036D06: 48 8B 6C 24 38 mov rbp,qword ptr [rsp+38h]
    0000000000036D0B: 48 8B 5C 24 30 mov rbx,qword ptr [rsp+30h]
    0000000000036D10: 48 83 C4 28 add rsp,28h
    0000000000036D14: C3 ret

    This problem started with driver version 12.3. I upgraded to 12.4, which did not help. I downgraded to 12.1, which also did not help.

    It appears that the part of the video hardware responsible for blitting is in power-saving mode, and it takes 90 microseconds to wake up for each Blt() call. Perhaps there is a way to fix this in some sort of power-saving configuration.

    By the way, 3D performance is fine; the problem is limited to slow Blt() calls, which apparently don't happen too often in 3D mode.

    Any suggestions would be appreciated. Thanks in advance.
  jayman

    PowerColor AX7990 x 2 3GB
    Do you have the latest codec pack installed? It is important for playing video. You would also be better off using Windows 7 than XP, for a number of reasons.
  kn00tcn

    570m / MSI 660 Gaming OC
    in XP, on my x800xtpe, in some cases i might end up with really slow 2d drawing of animations of windows.... usually that gets fixed by opening & closing a fullscreen game, at least back in the omega 7.x days

    another thing to check is if your DPC latency is ok

  ymatioun

    Sapphire Radeon HD 6870
    Checking DPC latency sounds like a good idea. I just checked it: it is around 200 microseconds, which seems to be within the normal range.

    Also, hardware acceleration is on. Turning it off and then back on did not help.
  kn00tcn

    570m / MSI 660 Gaming OC
    wow 200, i idle at like 45

    though that's probably not messing with 2d rendering
  robi

    ATI 7750
    Problem solved on my PC

    I tested my PC with the PassMark software and got 2D graphics performance 10 times lower than normal. 3D graphics performance was OK.

    My PC:

    CPU: AMD Phenom II 3.1 GHz
    Graphics: AMD ATI HD 7750 series
    Memory: 2x4 GB Kingston 1333 MHz CL9
    OS: Windows 7 Ultimate 64-bit

    The problem is that ATI/AMD somehow screwed up their latest drivers. I read more topics about it, and it seems to appear only on some combinations of hardware and drivers.

    At first I checked the results of the memory and CPU benchmarks, because CPU and memory performance have a direct impact on 2D graphics results. Both of my results were OK.

    Then I read some articles about modern graphics cards. It seems they don't have any hardware accelerators for 2D graphics, so they use normal PC RAM. Some combinations of drivers, hardware, and BIOS settings may cause extreme slowness.

    So I uninstalled all graphics drivers, but nothing changed.
    Then I completely uninstalled all drivers on my PC, including the motherboard drivers, and reinstalled the oldest versions possible.

    After that, 2D graphics performance rose 10 times, to normal values.

    Be sure to check your BIOS settings related to memory timings and power-saving features as well. Don't hesitate to disable all advanced features during tests and try all combinations of them. Some combinations of hardware, BIOS settings, and drivers may cause really extreme differences in benchmark results.

    Simply test it all. I wish you good luck. Robi.
