A bit of detailed info about SuperFetch and the cache manager in Windows (both relying on the memory manager)

Discussion in 'Operating Systems' started by mbk1969, Jan 29, 2018.

  1. mbk1969

    mbk1969 Ancient Guru

    From "Windows Internals" Seventh Edition:

    Proactive memory management (SuperFetch)
    Traditional memory management in operating systems has focused on the demand-paging model discussed thus far, with some advances in clustering and prefetching so that disk I/O can be optimized at the time of the demand-page fault. Client versions of Windows, however, include a significant improvement in the management of physical memory with the implementation of SuperFetch, a memory management scheme that enhances the least–recently accessed approach with historical file access information and proactive memory management.

    Components
    SuperFetch has several components in the system that work hand in hand to proactively manage memory and limit the impact on user activity when SuperFetch is performing its work. These components include the following:
    • Tracer The tracer mechanisms are part of a kernel component (Pf) that allows SuperFetch to query detailed page-usage, session, and process information at any time. SuperFetch also uses the FileInfo mini-filter driver (%SystemRoot%\System32\Drivers\Fileinfo.sys) to track file usage.
    • Agents SuperFetch keeps file-page usage information within multiple agents. Agents group pages by attributes such as the following:
      • Page access while the user was active
      • Page access by a foreground process
      • Hard fault while the user was active
      • Page access during an application launch
      • Page access upon the user returning after a long idle period
    • Scenario manager This component, also called the context agent, manages the three SuperFetch scenario plans: hibernation, standby, and fast-user switching. The kernel-mode part of the scenario manager provides APIs for initiating and terminating scenarios, managing the current scenario state, and associating tracing information with these scenarios.
    • Rebalancer Based on the information provided by the SuperFetch agents, as well as the current state of the system (such as the state of the prioritized page lists), the rebalancer — a specialized agent in the Superfetch user-mode service — queries the PFN database and reprioritizes it based on the associated score of each page, thus building the prioritized standby lists. The rebalancer can also issue commands to the memory manager that modify the working sets of processes on the system, and it is the only agent that actually takes action on the system. Other agents merely filter information for the rebalancer to use in its decisions. In addition to reprioritization, the rebalancer initiates prefetching through the prefetcher thread, which uses FileInfo and kernel services to preload memory with useful pages.
    All these components use facilities inside the memory manager that allow for the querying of detailed information about the state of each page in the PFN database, the current page counts for each page list and prioritized list, and more. SuperFetch components also use prioritized I/O to minimize user impact.

    Tracing and logging

    SuperFetch makes most of its decisions based on information that has been integrated, parsed, and post-processed from raw traces and logs, making these two components among the most critical. Tracing is like ETW in some ways because it uses certain triggers in code throughout the system to generate events, but it also works in conjunction with facilities already provided by the system, such as power-manager notification, process callbacks, and file-system filtering. The tracer also uses traditional page-aging mechanisms that exist in the memory manager, as well as newer working-set aging and access tracking implemented for SuperFetch.

    SuperFetch always keeps a trace running and continuously queries trace data from the system, which tracks page usage and access through the memory manager’s access bit tracking and working set aging. To track file-related information, which is as critical as page usage because it allows prioritization of file data in the cache, SuperFetch leverages existing filtering functionality with the addition of the FileInfo driver. This driver sits on the file-system device stack and monitors access and changes to files at the stream level, which provides it with fine-grained understanding of file access. The main job of the FileInfo driver is to associate streams — identified by a unique key, currently implemented as the FsContext field of the respective file object — with file names so that the user-mode Superfetch service can identify the specific file stream and offset that a page in the standby list belonging to a memory-mapped section is associated with. It also provides the interface for prefetching file data transparently, without interfering with locked files and other file-system state.

    The rest of the driver ensures that the information stays consistent by tracking deletions, renaming operations, truncations, and the reuse of file keys by implementing sequence numbers.

    At any time during tracing, the rebalancer might be invoked to repopulate pages differently. These decisions are made by analyzing information such as the distribution of memory within working sets, the zero page list, the modified page list and the standby page lists, the number of faults, the state of PTE access bits, the per-page usage traces, current virtual address consumption, and working set size.

    A given trace can be a page-access trace, in which the tracer uses the access bit to keep track of which pages were accessed by the process (both file pages and private memory). Or, it can be a name-logging trace, which monitors updates to the file name–to–file key mapping for the actual file on disk. These allow SuperFetch to map a page back to the file object with which it is associated.

    Although a SuperFetch trace only keeps track of page accesses, the Superfetch service processes this trace in user mode and goes much deeper, adding its own richer information such as where the page was loaded from (for example, resident memory or a hard page fault), whether this was the initial access to that page, and what the rate of page access actually is. Additional information, such as the system state, is also kept, as well as information about the recent scenarios in which each traced page was last referenced. The generated trace information is kept in memory through a logger into data structures, which — in the case of page-access traces — identify a virtual address–to–working set pair or, in the case of a name-logging trace, a file-to-offset pair. SuperFetch can thus keep track of which ranges of virtual addresses for a given process have page-related events and which ranges of offsets for a given file have similar events.
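    To make the shape of that logged data a little more concrete, here is a purely illustrative C++ sketch (mine, not the book's) of the two kinds of trace records described above; the names and fields are guesses, not the actual Sysmain structures:

    // Illustrative only: hypothetical shapes of the two trace record kinds.
    // The real Superfetch/Sysmain data structures are not documented.
    #include <cstdint>

    struct PageAccessRecord {        // page-access trace entry
        uint32_t processId;          // owning process / working set
        uint64_t vaRangeStart;       // start of VA range with page-related events
        uint64_t vaRangeEnd;         // end of VA range
        bool     hardFault;          // loaded via hard fault vs. already resident
        bool     initialAccess;      // first access to the page
        uint32_t accessCount;        // access-rate info added by the user-mode service
    };

    struct NameLoggingRecord {       // name-logging trace entry
        uint64_t fileKey;            // stream key (FsContext) mapped to a file name
        uint64_t offsetStart;        // start of file-offset range with events
        uint64_t offsetEnd;          // end of file-offset range
    };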

    Scenarios
    One aspect of SuperFetch that is distinct from its primary page reprioritization and prefetching mechanisms (covered in more detail in the next section) is its support for scenarios, which are specific actions on the machine for which SuperFetch strives to improve the user experience. These scenarios are as follows:
    • Hibernation The goal of hibernation is to intelligently decide which pages are saved in the hibernation file other than the existing working-set pages. The idea is to minimize the amount of time it takes for the system to become responsive after a resume.
    • Standby The goal of standby is to completely remove hard faults after resume. Because a typical system can resume in less than 2 seconds, but can take 5 seconds to spin up the hard drive after a long sleep, a single hard fault could cause such a delay in the resume cycle. SuperFetch prioritizes pages needed after a standby to remove this chance.
    • Fast user switching The goal of fast user switching is to keep an accurate priority and understanding of each user’s memory. That way, switching to another user will cause the user’s session to be immediately usable, and won’t require a large amount of lag time to allow pages to be faulted in.

    Each of these scenarios has different goals, but all are centered around the main purpose of minimizing or removing hard faults.

    Scenarios are hardcoded, and SuperFetch manages them through the NtSetSystemInformation and NtQuerySystemInformation APIs that control system state. For SuperFetch purposes, a special information class, SystemSuperfetchInformation, is used to control the kernel-mode components and to generate requests such as starting, ending, and querying a scenario or associating one or more traces with a scenario.
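    As a user-mode illustration (not from the book): the call shape looks roughly like the sketch below. NtQuerySystemInformation is a real ntdll export, but SystemSuperfetchInformation and its input structure are undocumented, so the class value and layout here are assumptions taken from public reverse-engineering (for example, the phnt headers) and may change between Windows versions; the call also requires administrative privileges.

    // Sketch only: the information class value and SUPERFETCH_INFORMATION layout
    // below are undocumented assumptions and not guaranteed by Microsoft.
    #include <windows.h>

    typedef LONG (NTAPI *PFN_NtQuerySystemInformation)(ULONG, PVOID, ULONG, PULONG);

    typedef struct _SUPERFETCH_INFORMATION {  // assumed wrapper layout
        ULONG Version;                        // protocol version expected by Sysmain
        ULONG Magic;                          // signature value
        ULONG InfoClass;                      // which SuperFetch sub-request (scenario start/end/query, ...)
        PVOID Data;                           // request-specific buffer
        ULONG Length;                         // size of Data
    } SUPERFETCH_INFORMATION;

    LONG QuerySuperfetch(SUPERFETCH_INFORMATION* info)
    {
        const ULONG SystemSuperfetchInformation = 0x4F;  // assumed class value
        auto fn = reinterpret_cast<PFN_NtQuerySystemInformation>(
            GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtQuerySystemInformation"));
        if (!fn) return -1;
        ULONG returned = 0;
        return fn(SystemSuperfetchInformation, info, sizeof(*info), &returned);
    }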

    Each scenario is defined by a plan file, which contains, at minimum, a list of pages associated with the scenario. Page priority values are also assigned according to certain rules (described next). When a scenario starts, the scenario manager is responsible for responding to the event by generating the list of pages that should be brought into memory and at which priority.

    Page priority and rebalancing
    You’ve already seen that the memory manager implements a system of page priorities to define which standby list pages will be repurposed for a given operation and in which list a given page will be inserted. This mechanism provides benefits when processes and threads have associated priorities—for example, ensuring that a defragmenter process doesn’t pollute the standby page list and/or steal pages from an interactive foreground process. But its real power is unleashed through SuperFetch’s page prioritization schemes and rebalancing, which don’t require manual application input or hardcoded knowledge of process importance.

    SuperFetch assigns page priority based on an internal score it keeps for each page, part of which is based on frequency-based usage. This usage counts how many times a page was used in given relative time intervals, such as by hour, day, or week. The system also keeps track of time of use, recording how long it’s been since a given page was accessed. Finally, data such as where this page comes from (which list) and other access patterns is used to compute this score.

    The score is translated into a priority number, which can be anywhere from 1 to 6. (A priority of 7 is used for another purpose, described later.) Going down each level, the lower standby page list priorities are repurposed first. Priority 5 is typically used for normal applications, while priority 1 is meant for background applications that third-party developers can mark as such. Finally, priority 6 is used to keep a certain number of high-importance pages as far away as possible from repurposing. The other priorities are a result of the score associated with each page.
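    The one piece of this that third-party developers can reach directly is the documented memory-priority API, which is presumably what the "background applications" remark above refers to. A minimal sketch of my own (Windows 8 and later):

    #include <windows.h>

    // Ask the memory manager to treat this process's pages as low priority, so
    // its standby pages are repurposed before those of normal (priority 5) apps.
    bool MakeSelfBackgroundMemoryPriority()
    {
        MEMORY_PRIORITY_INFORMATION prio = {};
        prio.MemoryPriority = MEMORY_PRIORITY_VERY_LOW;   // 1, the lowest documented value
        return SetProcessInformation(GetCurrentProcess(),
                                     ProcessMemoryPriority,
                                     &prio, sizeof(prio)) != FALSE;
    }

    // The same can be done per thread with SetThreadInformation and
    // ThreadMemoryPriority, or implicitly with
    // SetPriorityClass(GetCurrentProcess(), PROCESS_MODE_BACKGROUND_BEGIN).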

    Because SuperFetch “learns” a user’s system, it can start from scratch with no existing historical data and slowly build an understanding of the different page accesses associated with the user. However, this would result in a significant learning curve whenever a new application, user, or service pack was installed. Instead, by using an internal tool, Windows can pre-train SuperFetch to capture SuperFetch data and then turn it into prebuilt traces. These prebuilt traces were generated by the SuperFetch team, who traced common usages and patterns that all users will probably encounter, such as clicking the Start menu, opening Control Panel, or using the File Open/Save dialog box. This trace data was then saved to history files (which ship as resources in Sysmain.dll) and is used to prepopulate the special priority 7 list. This list is where the most critical data is placed and is rarely repurposed. Pages at priority 7 are file pages kept in memory even after the process has exited and even across reboots (by being repopulated at the next boot). Finally, pages with priority 7 are static, in that they are never reprioritized, and SuperFetch will never dynamically load pages at priority 7 other than the static pretrained set.

    The prioritized list is loaded into memory (or prepopulated) by the rebalancer, but the actual act of rebalancing is handled by both SuperFetch and the memory manager. As shown, the prioritized standby page list mechanism is internal to the memory manager, and decisions as to which pages to throw out first and which to protect are innate, based on the priority number. The rebalancer does its job not by manually rebalancing memory but by reprioritizing it, which causes the memory manager to perform the needed tasks. The rebalancer is also responsible for reading the actual pages from disk, if needed, so that they are present in memory (prefetching). It then assigns the priority that is mapped by each agent to the score for each page, and the memory manager ensures that the page is treated according to its importance.

    The rebalancer can take action without relying on other agents — for example, if it notices that the distribution of pages across paging lists is suboptimal or that the number of repurposed pages across different priority levels is detrimental. The rebalancer can also trigger working-set trimming, which might be required for creating an appropriate budget of pages that will be used for SuperFetch prepopulated cache data. The rebalancer will typically take low-utility pages — such as those that are already marked as low priority, that are zeroed, or that have valid content but not in any working set and have been unused — and build a more useful set of pages in memory, given the budget it has allocated itself. After the rebalancer has decided which pages to bring into memory and at which priority level they need to be loaded (as well as which pages can be thrown out), it performs the required disk reads to prefetch them. It also works in conjunction with the I/O manager’s prioritization schemes so that I/Os are performed with very low priority and do not interfere with the user.

    The memory consumption used by prefetching is backed by standby pages. As described in the discussion of page dynamics, standby memory is available memory because it can be repurposed as free memory for another allocator at any time. In other words, if SuperFetch is prefetching the wrong data, there is no real impact on the user because that memory can be reused when needed and doesn’t actually consume resources.

    Finally, the rebalancer also runs periodically to ensure that pages it has marked as high priority have actually been recently used. Because these pages will rarely (sometimes never) be repurposed, it is important not to waste them on data that is rarely accessed but may have appeared to be frequently accessed during a certain period. If such a situation is detected, the rebalancer runs again to push those pages down in the priority lists.

    A special agent called the application launch agent is involved in a different kind of prefetching mechanism, which attempts to predict application launches and builds a Markov chain model that describes the probability of certain application launches given the existence of other application launches within a time segment. These time segments are divided across four different periods of roughly 6 hours each — morning, noon, evening, and night — and by weekday or weekend. For example, if on Saturday and Sunday evening a user typically launches Outlook after having launched Word, the application launch agent will likely prefetch Outlook based on the high probability of it running after Word during weekend evenings.
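    Just to illustrate the idea (a toy model of mine, not the actual Sysmain implementation), such a predictor boils down to a table of launch counts keyed by time segment and the previously launched application:

    // Toy first-order "which app follows which" launch predictor, bucketed by
    // time segment, in the spirit described above. Purely illustrative.
    #include <algorithm>
    #include <map>
    #include <string>
    #include <utility>

    enum class Segment { MorningWeekday, NoonWeekday, EveningWeekday, NightWeekday,
                         MorningWeekend, NoonWeekend, EveningWeekend, NightWeekend };

    struct LaunchModel {
        // (segment, previously launched app) -> candidate next app -> count
        std::map<std::pair<Segment, std::string>, std::map<std::string, int>> counts;

        void Observe(Segment seg, const std::string& prev, const std::string& next) {
            ++counts[{seg, prev}][next];
        }

        // Most likely next launch, e.g. "Outlook" after "Word" on weekend evenings.
        std::string Predict(Segment seg, const std::string& prev) const {
            auto it = counts.find({seg, prev});
            if (it == counts.end() || it->second.empty()) return {};
            return std::max_element(it->second.begin(), it->second.end(),
                [](const auto& a, const auto& b) { return a.second < b.second; })->first;
        }
    };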

    Robust performance
    A final performance-enhancing functionality of SuperFetch is called robustness, or robust performance. This component — managed by the user-mode Superfetch service but ultimately implemented in the kernel (Pf routines) — watches for specific file I/O access that might harm system performance by populating the standby lists with unneeded data. For example, if a process were to copy a large file across the file system, the standby list would be populated with the file’s contents, even though that file might never be accessed again (or not for a long period of time). This would throw out any other data within that priority — and if this was an interactive and useful program, chances are its priority would be at least 5.

    SuperFetch responds to two specific kinds of I/O access patterns:
    • Sequential file access With this type of I/O access pattern, the system goes through all the data in a file.
    • Sequential directory access With this type of I/O access, the system goes through every file in a directory.

    When SuperFetch detects that a certain amount of data past an internal threshold has been populated in the standby list as a result of this kind of access, it applies aggressive deprioritization (called robustion) to the pages being used to map this file. This occurs within the targeted process only so as not to penalize other applications. These pages, which are said to be robusted, essentially become reprioritized to priority 2.
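    A purely hypothetical sketch of that reactive logic (the real thresholds and bookkeeping live inside Sysmain and the Pf kernel routines and are not documented):

    // Illustrative only: per-process tracking of sequentially accessed bytes
    // landing on the standby list, with the process "robusted" (its file pages
    // dropped to priority 2) once an assumed threshold is crossed.
    #include <cstdint>
    #include <unordered_map>

    constexpr uint64_t kRobustionThresholdBytes = 64ull * 1024 * 1024; // assumed value
    constexpr int      kRobustedPriority        = 2;
    constexpr int      kNormalPriority          = 5;

    struct ProcessIoState {
        uint64_t sequentialBytesOnStandby = 0;
        bool     robusted                 = false;  // remembered for future runs
    };

    std::unordered_map<uint32_t /*pid*/, ProcessIoState> g_ioState;

    // Hypothetically called whenever sequential file/directory access by a
    // process adds pages to the standby list; returns the priority to assign.
    int OnSequentialAccess(uint32_t pid, uint64_t bytesAdded)
    {
        ProcessIoState& s = g_ioState[pid];
        s.sequentialBytesOnStandby += bytesAdded;
        if (s.robusted || s.sequentialBytesOnStandby > kRobustionThresholdBytes) {
            s.robusted = true;        // next time, robust as soon as pages are mapped
            return kRobustedPriority; // penalize only this process's pages
        }
        return kNormalPriority;
    }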

    Because this component of SuperFetch is reactive and not predictive, it does take some time for the robustion to kick in. SuperFetch will therefore keep track of this process for the next time it runs. Once SuperFetch has determined that it appears that this process always performs this kind of sequential access, it remembers this and robusts the file pages as soon as they’re mapped instead of waiting for the reactive behavior. At this point, the entire process is now considered robusted for future file access.

    Just by applying this logic, however, SuperFetch could potentially hurt many legitimate applications or user scenarios that perform sequential access in the future. For example, by using the Sysinternals Strings.exe utility, you can look for a string in all executables that are part of a directory. If there are many files, SuperFetch would likely perform robustion. Now, next time you run Strings.exe with a different search parameter, it would run just as slowly as it did the first time even though you’d expect it to run much faster. To prevent this, SuperFetch keeps a list of processes that it watches into the future, as well as an internal hard-coded list of exceptions. If a process is detected to later re-access robusted files, robustion is disabled on the process to restore the expected behavior.

    The main point to remember when thinking about robustion — and SuperFetch optimizations in general — is that SuperFetch constantly monitors usage patterns and updates its understanding of the system to avoid fetching useless data. Although changes in a user’s daily activities or application startup behavior might cause SuperFetch to pollute the cache with irrelevant data or to throw out data that it might think is useless, it will quickly adapt to any pattern changes. If the user’s actions are erratic and random, the worst that can happen is that the system will behave in a similar state as if SuperFetch were not present at all. If SuperFetch is ever in doubt or cannot track data reliably, it quiets itself and doesn’t make changes to a given process or page.
     
    Last edited: Jan 30, 2018
  2. mbk1969

    mbk1969 Ancient Guru

    A bit of info about Windows keeping track of physical memory:

    Page frame number database
    Several previous sections concentrated on the virtual view of a Windows process—page tables, PTEs, and VADs. The remainder of this chapter will explain how Windows manages physical memory, starting with how Windows keeps track of physical memory. Whereas working sets describe the resident pages owned by a process or the system, the PFN database describes the state of each page in physical memory. The page states are listed in Table 5-19.
    TABLE 5-19 Physical page states
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    • Active (also called valid)
    The page is part of a working set (either a process working set, a session working set, or a system working set), or it’s not in any working set (for example, a non-paged kernel page) and a valid PTE usually points to it.
    • Transition
    This is a temporary state for a page that isn’t owned by a working set and isn’t on any paging list. A page is in this state when an I/O to the page is in progress. The PTE is encoded so that collided page faults can be recognized and handled properly. (This use of the term transition differs from the use of the word in the section on invalid PTEs. An invalid transition PTE refers to a page on the standby or modified list.)
    • Standby
    The page previously belonged to a working set but was removed or was prefetched/clustered directly into the standby list. The page wasn’t modified since it was last written to disk. The PTE still refers to the physical page but it is marked invalid and in transition.
    • Modified
    The page previously belonged to a working set but was removed. However, the page was modified while it was in use and its current contents haven’t yet been written to disk or remote storage. The PTE still refers to the physical page but is marked invalid and in transition. It must be written to the backing store before the physical page can be reused.
    • Modified no-write
    This is the same as a modified page except that the page has been marked so that the memory manager’s modified page writer won’t write it to disk. The cache manager marks pages as modified no-write at the request of file system drivers. For example, NTFS uses this state for pages containing file system metadata so that it can first ensure that transaction log entries are flushed to disk before the pages they are protecting are written to disk. (NTFS transaction logging is explained in Chapter 13, “File systems,” in Part 2.)
    • Free
    The page is free but has unspecified dirty data in it. For security reasons, these pages can’t be given as a user page to a user process without being initialized with zeroes, but they can be overwritten with new data (for example, from a file) before being given to a user process.
    • Zeroed
    The page is free and has been initialized with zeroes by the zero page thread or was determined to already contain zeroes.
    • Rom
    The page represents read-only memory.
    • Bad
    The page has generated parity or other hardware errors and can't be used (or it is being used as part of an enclave).
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    ...
    Of the page states listed in Table 5-19, six are organized into linked lists so that the memory manager can quickly locate pages of a specific type. (Active/valid pages, transition pages, and overloaded “bad” pages aren’t in any system-wide page list.) Additionally, the standby state is associated with eight different lists ordered by priority.
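    A simplified, illustrative model of what a PFN entry and these lists track (my sketch; the real nt!_MMPFN layout is different and changes between releases):

    #include <cstdint>

    // Conceptual model only, mirroring Table 5-19 and the list organization above.
    enum class PageState : uint8_t {
        Active, Transition, Standby, Modified, ModifiedNoWrite,
        Free, Zeroed, Rom, Bad
    };

    struct PfnEntry {
        PageState state;       // which Table 5-19 state the physical page is in
        uint8_t   priority;    // 0-7 standby priority (meaningful on the standby lists)
        uint64_t  pteAddress;  // PTE (or prototype PTE) that references the page
        uint32_t  flink;       // PFN-indexed links into the page list the entry is on
        uint32_t  blink;
    };

    // Six kinds of system-wide lists, with the standby state split into eight
    // prioritized sublists.
    struct PageLists {
        uint32_t standbyByPriority[8];
        uint32_t modified, modifiedNoWrite, free, zeroed, rom;
    };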

    Page list dynamics
    Figure 5-37 shows a state diagram for page frame transitions. For simplicity, the modified-no-write, bad and ROM lists aren’t shown.

    Page frames move between the paging lists in the following ways:
    ■ When the memory manager needs a zero-initialized page to service a demand-zero page fault (a reference to a page that is defined to be all zeroes or to a user-mode committed private page that has never been accessed), it first attempts to get one from the zero page list. If the list is empty, it gets one from the free page list and zeroes the page. If the free list is empty, it goes to the standby list and zeroes that page.
    One reason zero-initialized pages are needed is to meet security requirements such as the Common Criteria (CC). Most CC profiles specify that user-mode processes be given initialized page frames to prevent them from reading a previous process’s memory contents. Thus, the memory manager gives user-mode processes zeroed page frames unless the page is being read in from a backing store. In that case, the memory manager prefers to use non-zeroed page frames, initializing them with the data off the disk or remote storage. The zero page list is populated from the free list by the zero page thread system thread (thread 0 in the System process). The zero page thread waits on a gate object to signal it to go to work. When the free list has eight or more pages, this gate is signaled. However, the zero page thread will run only if at least one processor has no other threads running, because the zero page thread runs at priority 0 and the lowest priority that a user thread can be set to is 1.
    ■ When the memory manager doesn’t require a zero-initialized page, it goes first to the free list. If that’s empty, it goes to the zeroed list. If the zeroed list is empty, it goes to the standby lists. Before the memory manager can use a page frame from the standby lists, it must first backtrack and remove the reference from the invalid PTE (or prototype PTE) that still points to the page frame. Because entries in the PFN database contain pointers back to the previous user’s page table page (or to a page of prototype PTE pool for shared pages), the memory manager can quickly find the PTE and make the appropriate change.
    ■ When a process must give up a page out of its working set either because it referenced a new page and its working set was full or the memory manager trimmed its working set, the page goes to the standby lists if the page was clean (not modified) or to the modified list if the page was modified while it was resident.
    ■ When a process exits, all the private pages go to the free list. Also, when the last reference to a page-file-backed section is closed, and the section has no remaining mapped views, these pages also go to the free list.
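    Tying the first transition above back to user mode (this example is mine, not the book's, but it uses only documented behavior): committed private memory is demand-zero, so the first touch of a freshly committed page is exactly what pulls a page off the zeroed list (or the free list, zeroed on the fly):

    #include <windows.h>
    #include <cassert>

    int main()
    {
        // Commit 1 MB of private memory. No physical pages are consumed yet;
        // the PTEs are demand-zero.
        const SIZE_T size = 1 << 20;
        BYTE* p = static_cast<BYTE*>(
            VirtualAlloc(nullptr, size, MEM_RESERVE | MEM_COMMIT, PAGE_READWRITE));
        assert(p != nullptr);

        // First access faults each page in; the memory manager must hand the
        // process a zero-initialized page frame.
        assert(p[0] == 0 && p[size - 1] == 0);

        p[0] = 1;  // the page is now dirty; if trimmed, it goes to the modified
                   // list rather than the standby list
        VirtualFree(p, 0, MEM_RELEASE);  // on teardown, private pages end up
                                         // back on the free list
        return 0;
    }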
     
  3. mbk1969

    mbk1969 Ancient Guru

    A bit of info on the cache manager relying on the memory manager:

    Chapter 11. Cache Manager
    The cache manager is a set of kernel-mode functions and system threads that cooperate with the memory manager to provide data caching for all Windows file system drivers (both local and network). In this chapter, we’ll explain how the cache manager, including its key internal data structures and functions, works; how it is sized at system initialization time; how it interacts with other elements of the operating system; and how you can observe its activity through performance counters. We’ll also describe the five flags on the Windows CreateFile function that affect file caching.
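    To make the CreateFile angle concrete, here is a small example of my own using documented caching-related flags (whether these are exactly the five the book means is covered later in the chapter):

    #include <windows.h>

    // Open a log file whose access pattern is known to be sequential, letting the
    // cache manager read ahead aggressively, and write through the cache so that
    // completed writes are already on disk.
    HANDLE OpenSequentialLog(const wchar_t* path)
    {
        return CreateFileW(path,
                           GENERIC_READ | GENERIC_WRITE,
                           FILE_SHARE_READ,
                           nullptr,
                           OPEN_ALWAYS,
                           FILE_FLAG_SEQUENTIAL_SCAN | FILE_FLAG_WRITE_THROUGH,
                           nullptr);
    }

    // Other caching-related options include FILE_FLAG_RANDOM_ACCESS (disables
    // read-ahead), FILE_FLAG_NO_BUFFERING (bypasses the cache manager; requires
    // sector-aligned I/O), and FILE_ATTRIBUTE_TEMPORARY (discourages the lazy
    // writer from flushing short-lived files).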

    Key Features of the Cache Manager
    The cache manager has several key features:
    • Supports all file system types (both local and network), thus removing the need for each file system to implement its own cache management code
    • Uses the memory manager to control which parts of which files are in physical memory (trading off demands for physical memory between user processes and the operating system)
    • Caches data on a virtual block basis (offsets within a file)—in contrast to many caching systems, which cache on a logical block basis (offsets within a disk volume)—allowing for intelligent read-ahead and high-speed access to the cache without involving file system drivers (This method of caching, called fast I/O, is described later in this chapter.)
    • Supports “hints” passed by applications at file open time (such as random versus sequential access, temporary file creation, and so on)
    • Supports recoverable file systems (for example, those that use transaction logging) to recover data after a system failure
    Although we’ll talk more throughout this chapter about how these features are used in the cache manager, in this section we’ll introduce you to the concepts behind these features.

    Single, Centralized System Cache
    Some operating systems rely on each individual file system to cache data, a practice that results either in duplicated caching and memory management code in the operating system or in limitations on the kinds of data that can be cached. In contrast, Windows offers a centralized caching facility that caches all externally stored data, whether on local hard disks, floppy disks, network file servers, or CD-ROMs. Any data can be cached, whether it’s user data streams (the contents of a file and the ongoing read and write activity to that file) or file system metadata (such as directory and file headers). As you’ll discover in this chapter, the method Windows uses to access the cache depends on the type of data being cached.

    The Memory Manager
    One unusual aspect of the cache manager is that it never knows how much cached data is actually in physical memory. This statement might sound strange because the purpose of a cache is to keep a subset of frequently accessed data in physical memory as a way to improve I/O performance. The reason the cache manager doesn’t know how much data is in physical memory is that it accesses data by mapping views of files into system virtual address spaces, using standard section objects (file mapping objects in Windows API terminology). (Section objects are the basic primitive of the memory manager and are explained in detail in Chapter 10.) As addresses in these mapped views are accessed, the memory manager pages in blocks that aren’t in physical memory. And when memory demands dictate, the memory manager unmaps these pages out of the cache and, if the data has changed, pages the data back to the files.

    By caching on the basis of a virtual address space using mapped files, the cache manager avoids generating read or write I/O request packets (IRPs) to access the data for files it’s caching. Instead, it simply copies data to or from the virtual addresses where the portion of the cached file is mapped and relies on the memory manager to fault in (or out) the data into (or out of) memory as needed. This process allows the memory manager to make global trade-offs on how much memory to give to the system cache versus how much to give to user processes. (The cache manager also initiates I/O, such as lazy writing, which is described later in this chapter; however, it calls the memory manager to write the pages.) Also, as you’ll learn in the next section, this design makes it possible for processes that open cached files to see the same data as do processes that are mapping the same files into their user address spaces.
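    Because the cache manager builds its views from ordinary section objects, user-mode code that maps the same file sees the same physical pages. A small documented example of that same mechanism (mine, not the book's):

    #include <windows.h>
    #include <cstdio>

    // Map an existing file into this process's address space using the same
    // section-object machinery the cache manager uses for its own views.
    int main()
    {
        HANDLE file = CreateFileW(L"C:\\Windows\\win.ini", GENERIC_READ,
                                  FILE_SHARE_READ, nullptr, OPEN_EXISTING,
                                  FILE_ATTRIBUTE_NORMAL, nullptr);
        if (file == INVALID_HANDLE_VALUE) return 1;

        HANDLE section = CreateFileMappingW(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
        if (!section) { CloseHandle(file); return 1; }

        // Touching the view demand-pages the data in; the memory manager (not
        // the cache manager) decides how long these pages stay resident.
        auto view = static_cast<const char*>(MapViewOfFile(section, FILE_MAP_READ, 0, 0, 0));
        if (view) {
            printf("first byte: %c\n", view[0]);
            UnmapViewOfFile(view);
        }
        CloseHandle(section);
        CloseHandle(file);
        return 0;
    }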
    ...

    Fast I/O

    Whenever possible, reads and writes to cached files are handled by a high-speed mechanism named fast I/O. Fast I/O is a means of reading or writing a cached file without going through the work of generating an IRP, as described in Chapter 8. With fast I/O, the I/O manager calls the file system driver’s fast I/O routine to see whether I/O can be satisfied directly from the cache manager without generating an IRP.

    Because the cache manager is architected on top of the virtual memory subsystem, file system drivers can use the cache manager to access file data simply by copying to or from pages mapped to the actual file being referenced without going through the overhead of generating an IRP.

    Fast I/O doesn’t always occur. For example, the first read or write to a file requires setting up the file for caching (mapping the file into the cache and setting up the cache data structures, as explained earlier in the section Cache Data Structures). Also, if the caller specified an asynchronous read or write, fast I/O isn’t used because the caller might be stalled during paging I/O operations required to satisfy the buffer copy to or from the system cache and thus not really providing the requested asynchronous I/O operation. But even on a synchronous I/O, the file system driver might decide that it can’t process the I/O operation by using the fast I/O mechanism, say, for example, if the file in question has a locked range of bytes (as a result of calls to the Windows LockFile and UnlockFile functions). Because the cache manager doesn’t know what parts of which files are locked, the file system driver must check the validity of the read or write, which requires generating an IRP. The decision tree for fast I/O is shown in Figure 11-11.
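    On the file system driver side, the "fast I/O routine" the I/O manager calls is one of the entries in the driver's FAST_IO_DISPATCH table. A common pattern, sketched here from memory of the WDK file system samples (treat it as indicative, not authoritative), is to point the read and write entries at the FsRtl copy helpers, which perform the checks above and then copy directly to or from the cache:

    // Kernel-mode sketch in the spirit of the WDK FastFAT sample: the FsRtl
    // helpers validate the request and then call the cache manager
    // (CcCopyRead/CcCopyWrite) when fast I/O is possible.
    #include <ntifs.h>

    static FAST_IO_DISPATCH g_FastIoDispatch;   // must outlive the driver object

    VOID SetupFastIo(PDRIVER_OBJECT DriverObject)
    {
        RtlZeroMemory(&g_FastIoDispatch, sizeof(g_FastIoDispatch));
        g_FastIoDispatch.SizeOfFastIoDispatch = sizeof(FAST_IO_DISPATCH);
        g_FastIoDispatch.FastIoRead  = FsRtlCopyRead;   // cached, synchronous reads
        g_FastIoDispatch.FastIoWrite = FsRtlCopyWrite;  // cached, synchronous writes
        DriverObject->FastIoDispatch = &g_FastIoDispatch;
    }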
    These steps are involved in servicing a read or a write with fast I/O:
    1. A thread performs a read or write operation.
    2. If the file is cached and the I/O is synchronous, the request passes to the fast I/O entry point of the file system driver stack. If the file isn’t cached, the file system driver sets up the file for caching so that the next time, fast I/O can be used to satisfy a read or write request.
    3. If the file system driver’s fast I/O routine determines that fast I/O is possible, it calls the cache manager’s read or write routine to access the file data directly in the cache. (If fast I/O isn’t possible, the file system driver returns to the I/O system, which then generates an IRP for the I/O and eventually calls the file system’s regular read routine.)
    4. The cache manager translates the supplied file offset into a virtual address in the cache.
    5. For reads, the cache manager copies the data from the cache into the buffer of the process requesting it; for writes, it copies the data from the buffer to the cache.
    6. One of the following actions occurs:
      • For reads where FILE_FLAG_RANDOM_ACCESS wasn’t specified when the file was opened, the read-ahead information in the caller’s private cache map is updated. Read-ahead may also be queued for files for which the FO_RANDOM_ACCESS flag is not specified.
      • For writes, the dirty bit of any modified page in the cache is set so that the lazy writer will know to flush it to disk.
      • For write-through files, any modifications are flushed to disk.
     
  4. EdKiefer

    EdKiefer Ancient Guru

    Where does Superfetch store its data for processes? Is it C:\Windows\Prefetch or the ReadyBoost folder?

    And the $10K question: how much does it really improve things with modern SSDs? :)
     

  5. mbk1969

    mbk1969 Ancient Guru

    C:\Windows\Prefetch

    Compare the speed of RAM to SSD.
     
  6. EdKiefer

    EdKiefer Ancient Guru

    I just happened to turn off the SysMain service two days ago and have not noticed any slower starting of any app; maybe if I stopwatched it I might see some small difference.
    I did notice memory compression gets turned off, and now I wonder if it would be better to leave SysMain on but turn off:

    Disable-MMAgent -ApplicationPreLaunch
    Disable-MMAgent -PageCombining
    Disable-MMAgent -MemoryCompression

    I wonder what the difference would be.
    Already had ApplicationPreLaunch disabled.

    Edit: Nice write up on it by the way.
     
  7. mbk1969

    mbk1969 Ancient Guru

    These are copy-n-paste from the "Windows Internals" book. :cool:

    As I take it, ApplicationPreLaunch is for UWP applications.

    I saw a two or three times faster start of the game(s). The test should be: after booting, launch a game you played on previous days, then quit the game and clear the standby list(s), then launch the game again. I experimented with standby lists while developing my own standby list solution.
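    For completeness (and as an aside, this is not from the book): clearing the standby lists from code goes through the same undocumented NtSetSystemInformation path that tools like RAMMap use. The information class and command values below come from public reverse-engineering, not Microsoft documentation, and the call needs an elevated process with SeProfileSingleProcessPrivilege enabled:

    // Sketch only: SystemMemoryListInformation (0x50) and the purge command (4)
    // are undocumented, assumed values and may change between Windows versions.
    #include <windows.h>

    typedef LONG (NTAPI *PFN_NtSetSystemInformation)(ULONG, PVOID, ULONG);

    bool PurgeStandbyList()
    {
        const ULONG SystemMemoryListInformation = 0x50;  // assumed class value
        ULONG command = 4;                               // assumed: purge standby list

        auto fn = reinterpret_cast<PFN_NtSetSystemInformation>(
            GetProcAddress(GetModuleHandleW(L"ntdll.dll"), "NtSetSystemInformation"));
        if (!fn) return false;

        // SeProfileSingleProcessPrivilege must already be enabled in the token
        // (for example, via AdjustTokenPrivileges) or the call fails.
        return fn(SystemMemoryListInformation, &command, sizeof(command)) >= 0;
    }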
    Superfetch utilises cool mathematics to predict what you will launch today (gathering and analysing statistical data), so to predict its prediction we have to understand its logic. For example, it may be that if you launch a specific game every Friday for several weeks, Superfetch will pre-load it (namely) on Fridays.

    PS: Prefetching (I mean pre-launching) apps is not the primary purpose of SuperFetch overall, it seems to me.
     
    Last edited: Aug 22, 2020
  8. mbk1969

    mbk1969 Ancient Guru

    @EdKiefer
    https://docs.microsoft.com/en-us/powershell/module/mmagent/enable-mmagent?view=win10-ps
    No mention of UWP apps, so I don't know where I got that...
     
  9. EdKiefer

    EdKiefer Ancient Guru

  10. mbk1969

    mbk1969 Ancient Guru


  11. EdKiefer

    EdKiefer Ancient Guru

  12. Astyanax

    Astyanax Ancient Guru

    Disabling memory compression has been linked to stuttering on Windows 10 since 1903
     
  13. mbk1969

    mbk1969 Ancient Guru

    By whom? On what rigs? With what amount of RAM?

    I suspect that with a sufficient amount of RAM, disabling memory compression (and page combining) removes the related CPU activity - a little bonus.

    And the guru A2Razor, in an old post (I linked), stated that FPS increased after disabling both memory compression and page combining:
    But his rig had an impressive 64GB of RAM.
     
  14. EdKiefer

    EdKiefer Ancient Guru

    No issue on my end yet.

    PS: I have 16GB of memory and with my usage I never go much above 50%. I am not a big multi-tasker.
    The reason I disabled it was because it seems too aggressive to me; it starts compressing at like 25% of use. I wish there was a setting for how aggressive it is.
     
  15. Astyanax

    Astyanax Ancient Guru

    They probably had memory-busy systems, but the way compression works it should only run in idle cycles and never waste CPU time during major load.
     

  16. mbk1969

    mbk1969 Ancient Guru

    In any case, (1) users can switch it on/off to test, and (2) getting rid of even a small CPU load can't be the cause of stutters.
     
  17. EdKiefer

    EdKiefer Ancient Guru

    OK, so it might compress when the system is idle, but what about decompressing when memory is needed by a process? That would need to be dynamic, I think.
    Right, it is easy to switch it on or off.
     
  18. Astyanax

    Astyanax Ancient Guru

    It uses the same XPRESS scheme that you can apply to NTFS file systems with compact.exe; it should be barely measurable.
     
  19. Smough

    Smough Master Guru

    Can someone give me a TL;DR for dummies? I mean, for games, is it better to keep this on or off? I see some people debate this, but I'd like to know. I use an SSD as my OS drive, so I don't know if it could help.
     
  20. mbk1969

    mbk1969 Ancient Guru

    What exactly?
    To pre-launch apps? Yes, if it pre-launched the app you are about to launch.
    To optimize app startup? Yes, without a doubt.
     
