
A bit detailed info about memory compression in Win10

Discussion in 'Operating Systems' started by mbk1969, Jan 29, 2018.

  1. mbk1969

    mbk1969 Ancient Guru

    Messages:
    7,886
    Likes Received:
    4,521
    GPU:
    GeForce GTX 1070
    From "Windows Internals" Seventh Edition:

    Memory compression
    The Windows 10 memory manager implements a mechanism that compresses private and page-file-backed section pages that are on the modified page list. The primary candidates for compression are private pages belonging to UWP apps because compression works very well with the working set swapping and emptying that already occurs for such applications if memory is tight. After an application is suspended and its working set is outswapped, the working set can be emptied at any time and dirty pages can be compressed. This will create additional available memory that may be enough to hold another application in memory without making the first application’s pages leave memory.

    Note Experiments have shown that pages compress to around 30–50 percent of their original size using Microsoft’s Xpress algorithm, which balances speed with size, thus resulting in considerable memory savings.

    The memory compression architecture must adhere to the following requirements:
    • A page cannot be in memory in a compressed and an uncompressed form because this would waste physical memory due to duplication. This means that whenever a page is compressed, it must become a free page after successful compression.
    • The compression store must maintain its data structures and store the compressed data such that it is always saving memory for the system overall. This means that if a page doesn’t compress well enough, it will not be added to the store.
    • Compressed pages must appear as available memory (because they can really be repurposed if needed) to avoid creating a perception issue that compressing memory somehow increases memory consumption.
    Memory compression is enabled by default on client SKUs (phone, PC, Xbox, and so on). Server SKUs do not currently use memory compression, but that is likely to change in future server versions.
    During system startup, the Superfetch service (sysmain.dll, hosted in a svchost.exe instance) instructs the Store Manager in the executive through a call to NtSetSystemInformation to create a single system store (always the first store to be created), to be used by non-UWP applications. Upon app startup, each UWP application communicates with the Superfetch service and requests the creation of a store for itself.

    Compression illustration
    To get a sense of how memory compression works, let’s look at an illustrative example. Assume that at some point in time, the following physical pages exist:

    Zero/Free: {1} <=> {2} <=> {3} <=> {4} <=> {5} <=> {6} <=> {7} <=> {8}
    Active: {9} {10}
    Modified: {11} <=> {12} <=> {13} <=> {14} <=> {15} <=> {16}

    The zero and free page lists contain pages that hold zeroes and garbage, respectively, and can be used to satisfy memory commits; for the sake of this discussion, we’ll treat them as one list. The active pages belong to various processes, while the modified pages hold dirty data that has not yet been written to a page file, but can be soft-faulted into a process working set without an I/O operation if that process references a modified page.
    Now assume the memory manager decides to trim the modified page list—for example, because it has become too large or the zero/free pages have become too small. Assume three pages are to be removed from the modified list. The memory manager compresses their contents into a single page (taken from the zero/free list):

    Zero/Free: {1} <=> {2} <=> {3} <=> {4} <=> {5} <=> {6} <=> {7} <=> {8}
    Active: {9} {10}
    Modified: {11} <=> {12} <=> {13} <=> {14} <=> {15} <=> {16}

    Pages {11}, {12} and {13} are compressed into page {1}. After that’s done, page {1} is no longer free and is in fact active, part of the working set of the memory compression process (described in the next section).
    Pages {11}, {12}, and {13} are no longer needed and move to the free list; the compression saved two pages:

    Zero/Free: {2} <=> {3} <=> {4} <=> {5} <=> {6} <=> {7} <=> {8} <=> {11} <=> {12} <=> {13}
    Active: {9} {10}
    Active (Memory Compression Process): {1}
    Modified: {14} <=> {15} <=> {16}

    Suppose the same process repeats. This time, pages {14}, {15}, and {16} are compressed into (say) two pages ({2} and {3}). The state just before the operation:

    Zero/Free: {2} <=> {3} <=> {4} <=> {5} <=> {6} <=> {7} <=> {8} <=> {11} <=> {12} <=> {13}
    Active: {9} {10}
    Active (Memory Compression Process): {1}
    Modified: {14} <=> {15} <=> {16}

    The result is that pages {2} and {3} join the working set of the memory compression process, while pages {14}, {15}, and {16} become free:

    Zero/Free: {4} <=> {5} <=> {6} <=> {7} <=> {8} <=> {11} <=> {12} <=> {13} <=> {14} <=> {15} <=> {16}
    Active: {9} {10}
    Active (Memory Compression Process): {1} {2} {3}
    Modified:

    Suppose the memory manager later decides to trim the working set of the memory compression process. In that case, such pages are moved to the modified list because they contain data not yet written to a page file. Of course, they can at any time be soft-faulted back into their original process (decompressing in the process by using free pages). The following shows pages {1} and {2} being removed from the active pages of the memory compression process and moved to the modified list:

    Zero/Free: {4} <=> {5} <=> {6} <=> {7} <=> {8} <=> {11} <=> {12} <=> {13} <=> {14} <=> {15} <=> {16}
    Active: {9} {10}
    Active (Memory Compression Process): {3}
    Modified: {1} <=> {2}

    If memory becomes tight, the memory manager may decide to write the compressed modified pages {1} and {2} to a page file.
    Finally, after such pages have been written to a page file, they move to the standby list because their content is saved, so they can be repurposed if necessary. They can also be soft-faulted (as when they are part of the modified list) by decompressing them and moving the resulting pages to the active state under the relevant process working set.
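    The list transitions illustrated above can be sketched as a toy model (illustrative Python only; the real memory manager tracks far more per-page state, and all names here are made up):

```python
# Toy model of the page-list transitions described above.
def compress(zero_free, modified, compressor_ws, victims, compressed_count):
    """Move `victims` off the modified list, placing their compressed
    contents into `compressed_count` pages taken from the zero/free list."""
    targets = [zero_free.pop(0) for _ in range(compressed_count)]
    compressor_ws.extend(targets)      # compressed pages join the working set
                                       # of the Memory Compression process
    for p in victims:                  # the originals are no longer needed...
        modified.remove(p)
        zero_free.append(p)            # ...and become free pages
    return targets

zero_free  = [1, 2, 3, 4, 5, 6, 7, 8]
modified   = [11, 12, 13, 14, 15, 16]
compressor = []

compress(zero_free, modified, compressor, [11, 12, 13], 1)  # 3 pages -> 1
compress(zero_free, modified, compressor, [14, 15, 16], 2)  # 3 pages -> 2
print(compressor)   # [1, 2, 3]
print(modified)     # []
```

Running both steps reproduces the final state shown above: pages {1}, {2}, {3} active in the compression process, the modified list empty, and the rest free.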

    Compression architecture
    The compression engine needs a “working area” memory to store compressed pages and the data structures that manage them. In Windows 10 versions prior to 1607, the user address space of the System process was used. Starting with Windows 10 Version 1607, a new dedicated process called Memory Compression is used instead. One reason for creating this new process was that the System process memory consumption looked high to a casual observer, which implied the system was consuming a lot of memory. That was not the case, however, because compressed memory does not count against the commit limit. Nevertheless, sometimes perception is everything.

    The Memory Compression process is a minimal process, which means it does not load any DLLs. Rather, it just provides an address space to work with. It’s not running any executable image either — the kernel is just using its user mode address space.

    For each store, the Store Manager allocates memory in regions with a configurable region size. Currently, the size used is 128 KB. The allocations are done by normal VirtualAlloc calls as needed. The actual compressed pages are stored in 16-byte chunks within a region. Naturally, a compressed page (4 KB) can span many chunks.
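    As a quick sanity check on those numbers (simple arithmetic, not actual Store Manager code):

```python
REGION_SIZE = 128 * 1024   # 128 KB per region (current value per the text)
CHUNK_SIZE  = 16           # compressed data is stored in 16-byte chunks
PAGE_SIZE   = 4 * 1024     # a standard 4 KB page

# How many chunks fit in one region:
chunks_per_region = REGION_SIZE // CHUNK_SIZE
print(chunks_per_region)        # 8192

# An incompressible 4 KB page would span the maximum number of chunks:
max_chunks_per_page = PAGE_SIZE // CHUNK_SIZE
print(max_chunks_per_page)      # 256
```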

    Pages are managed with a B+Tree — essentially a tree where a node can have any number of children — where each page entry points to its compressed content within one of the regions. A store starts with zero regions, and regions are allocated and deallocated as needed. Regions are also associated with priorities.

    Adding a page involves the following major steps:
    1. If there is no current region with the page’s priority, allocate a new region, lock it in physical memory, and assign it the priority of the page to be added. Set the current region for that priority to the allocated region.
    2. Compress the page and store it in the region, rounding up to the granularity unit (16 bytes). For example, if a page compresses to 687 bytes, it consumes 43 16-byte units (always rounding up). Compression is done on the current thread, with low CPU priority (7) to minimize interference. When decompression is needed, it’s performed in parallel using all available processors.
    3. Update the page and region information in the Page and Region B+Trees.
    4. If the remaining space in the current region is not large enough to store the compressed page, a new region is allocated (with the same page priority) and set as the current region for that priority.
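    Steps 1–4 can be sketched in simplified Python (assumed names throughout; the real bookkeeping lives in B+Trees and the compressor is Xpress, neither of which is modeled here):

```python
import math

REGION_SIZE = 128 * 1024   # 128 KB regions
CHUNK = 16                 # 16-byte allocation granularity

class Store:
    def __init__(self):
        self.current_region = {}   # priority -> {"used": bytes_in_use}
        self.page_tree = {}        # page id -> (priority, offset, size)

    def add_page(self, page_id, priority, compressed_size):
        # Round the compressed size up to the 16-byte granularity:
        # e.g. 687 bytes -> 43 chunks -> 688 bytes consumed.
        size = math.ceil(compressed_size / CHUNK) * CHUNK
        region = self.current_region.get(priority)
        # Steps 1/4: allocate a fresh region if none exists for this
        # priority, or if the current one cannot fit the page.
        if region is None or region["used"] + size > REGION_SIZE:
            region = {"used": 0}
            self.current_region[priority] = region
        # Steps 2/3: record where the compressed data lives.
        self.page_tree[page_id] = (priority, region["used"], size)
        region["used"] += size
        return size

store = Store()
print(store.add_page(page_id=42, priority=5, compressed_size=687))  # 688
```

Note how the 687-byte example from step 2 consumes 43 chunks, i.e. 688 bytes of region space.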

    Removing a page from the store involves the following steps:
    1. Find the page entry in the Page B+Tree and the region entry in the Region B+Tree.
    2. Remove the entries and update the space used in the region.
    3. If the region becomes empty, deallocate the region.
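    The removal path, in the same simplified spirit (again with made-up names; the real store uses B+Trees rather than dicts):

```python
class Store:
    def __init__(self):
        self.pages = {}     # page id -> (region id, bytes consumed)
        self.regions = {}   # region id -> bytes in use

    def remove_page(self, page_id):
        # Step 1: look up the page entry and its region entry.
        region_id, size = self.pages[page_id]
        # Step 2: remove the entry and give back the space it used.
        del self.pages[page_id]
        self.regions[region_id] -= size
        # Step 3: a region that becomes empty is deallocated entirely.
        if self.regions[region_id] == 0:
            del self.regions[region_id]

store = Store()
store.regions = {0: 688}
store.pages = {42: (0, 688)}
store.remove_page(42)
print(store.regions)   # {} -- the region became empty and was freed
```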

    Regions become fragmented over time as compressed pages are added and removed. The memory for a region is not freed until the region is completely empty. This means some kind of compaction is necessary to reduce memory waste. A compaction operation is lazily scheduled with aggressiveness depending on the amount of fragmentation. Region priorities are taken into account when consolidating regions.
     
    yobooh, jura11, Ant1Cheater and 3 others like this.
  2. mbk1969

    mbk1969 Ancient Guru

    Messages:
    7,886
    Likes Received:
    4,521
    GPU:
    GeForce GTX 1070
    Memory compression can be observed/disabled/enabled in elevated PowerShell:

    Get-MMAgent
    Disable-MMAgent -MemoryCompression
    Enable-MMAgent -MemoryCompression
     
    Last edited: Jan 30, 2018
    yobooh, jura11, akbaar and 2 others like this.
  3. EdKiefer

    EdKiefer Ancient Guru

    Messages:
    2,310
    Likes Received:
    184
    GPU:
    MSI 970 Gaming 4G
    Nice write-up on the memory management sections, you're working overtime :)
     
    mbk1969 likes this.
  4. jaggerwild

    jaggerwild Master Guru

    Messages:
    687
    Likes Received:
    242
    GPU:
    Many
    WOW, my brain hurts!
     

  5. AsiJu

    AsiJu Ancient Guru

    Messages:
    5,797
    Likes Received:
    1,256
    GPU:
    MSI RTX 2070 Armor
    Yes very insightful, thank you! Bookmarked this thread and the other two you made (about memory combining and superfetch).

    mbk what's your personal take; can disabling memory compression boost performance in any scenario?
    Sounds like it should be kept on as it compresses "out-of-date" content in layman's terms if I got it right (suspended app's pages are compressed and working set emptied).
     
  6. mbk1969

    mbk1969 Ancient Guru

    Messages:
    7,886
    Likes Received:
    4,521
    GPU:
    GeForce GTX 1070
    It all comes down to the amount of RAM. This compression can reduce the number of pages swapped out to the pagefile (avoiding expensive hard faults). But if you have enough RAM and you don't bloat the current session with hundreds of processes (tabs in a browser, for example), you won't have massive swapping. So if you have 4GB of RAM (and a lazy-snail 5400 rpm HDD), then memory compression could give a noticeable boost, especially if you run many UWP apps in the background. But if you have 16GB, close the browser before starting a game, and don't use many UWP apps, then I assume you won't notice a boost from compression, but you can get unpredictable/irregular hiccups due to compression background activity. If you have 8GB then you are between two lands.

    So - as usual - only tests on rigs can give actual answers.

    I have 16GB, I don't launch many programs, don't keep tabs open in the browser, and don't use any UWP apps - so I disabled memory compression and memory combining.

    PS Had a thought: MS could actually use triggered activation of memory compression, e.g. turning it on only when free/standby memory runs too low.
     
    Last edited: Feb 6, 2018
    Jackalito, HK-1 and AsiJu like this.
  7. AsiJu

    AsiJu Ancient Guru

    Messages:
    5,797
    Likes Received:
    1,256
    GPU:
    MSI RTX 2070 Armor
    Thanks! I had a similar gist of it myself.

    Keeping it on for the time being but I may experiment / benchmark with it off as well.

    16 GB RAM and I keep my system as clean as possible. Also if a game is running nothing else is running etc.

    Browserwise think my personal best is 4 or 5 tabs open simultaneously and even that was confusing af... :p
     
    mbk1969 likes this.
  8. A2Razor

    A2Razor Master Guru

    Messages:
    477
    Likes Received:
    50
    GPU:
    Vega FE Liquid
    My gist is: the Windows implementation is, in concept at least, just like ZRAM in the Linux world (2010). In fact, it was probably created in response to it.

    -Intended to be inactive unless the system were to commit to swap space, and in that case used as a higher-priority target than committing to plain old swap, since compression is assumed to always be faster than accessing any disk.


    EDIT: On shutting off compression: you'll find that enabling compression causes contents to be compressed as you approach the limit of memory (i.e. when you are right on the edge of using all RAM). The worst-case scenario is one where the OS could actually fit all pages in memory, but cannot because a portion of memory is devoted to compression (kind of ironic).

    --This close-to-the-boundary situation can cause constant compression and decompression, given that pages cannot be used until they are decompressed. If the OS is also processor-starved and running a memory-intensive game (most games are not, but emulators can be), the hit from this can be significant.
     
    Last edited: Feb 7, 2018
    AsiJu and mbk1969 like this.
  9. woobilicious

    woobilicious New Member

    Messages:
    5
    Likes Received:
    1
    GPU:
    RX580 4GB
    It's worth mentioning that because memory compression runs in a system-critical portion of the OS, it takes CPU priority over everything else in the system, so you can effectively DoS your computer with an app that uses heaps of memory. I have written apps that used 8GB+ of memory (building a huge in-memory key-store), and the system would stop responding for 2-3 minutes while it was compressing, then uncompressing, then swapping various parts of memory. It's substantially quicker to just turn it off and let it swap directly to a fast SSD.
     
    mbk1969 likes this.
  10. Astyanax

    Astyanax Ancient Guru

    Messages:
    3,271
    Likes Received:
    844
    GPU:
    GTX 1080ti
    You misunderstand how the compression works: it isn't done on active pages, only on idle pages after a timeout period, so you cannot "DoS your computer" with memory compression.

    Your system stopped responding because the CPU, doing purely in-memory operations, was maxed out.
     

  11. woobilicious

    woobilicious New Member

    Messages:
    5
    Likes Received:
    1
    GPU:
    RX580 4GB
    You don't understand how the OS scheduler works. There's no way for an app just using some "memory" to stop a system from responding; my system doesn't lock up when doing memory-hard operations like opening my password manager, unless something tricks a very high-priority thread (like one in the kernel doing memory compression) into starving lower-priority apps like the window manager, task manager, and mouse cursor.

    Turning off memory compression fixed my issue; the system became responsive while it built the database. This doesn't contradict your statement that it only evicts "idle pages"; it's just that your understanding of what an idle page is is wrong. If you're building an in-memory database, when the system runs out of memory Windows will evict idle pages from other apps like Chrome, and then start evicting pages from my application that are idle. Because it's trying to build a key-value store at >500MB/s, you become "memory bound" and CPU-starved, because Windows memory compression is too slow to keep up with my SSD.
     
  12. Astyanax

    Astyanax Ancient Guru

    Messages:
    3,271
    Likes Received:
    844
    GPU:
    GTX 1080ti
    YOU don't understand how the OS scheduler works.

    500MB/s is nothing; RAM can read and write at gigabytes per second.

    Your CPU was the contention point.
     
  13. Alessio1989

    Alessio1989 Maha Guru

    Messages:
    1,377
    Likes Received:
    231
    GPU:
    .
    Modern hardware can copy tens of GB/s in RAM without freezing anything.
    What call did you use to allocate such a pile of memory?
     
    Last edited: Sep 17, 2019
    Astyanax likes this.
