How to find who is using/eating up the Virtual Address space on your SQL Server

Well, this is often a tricky situation, where you are running into Virtual Address Fragmentation and getting OOM (out-of-memory) errors in your SQL Server. Most often, people do not understand the difference between physical memory pressure vs. virtual memory pressure. Adding more RAM is definitely not a solution here!         Here are some sample error messages you might have seen in your SQL Errorlog which will indicate if this is physical memory or virtual memory issue,

 SQL 2000
WARNING: Failed to reserve contiguous memory of Size= 65536.
SQL Server could not spawn process_loginread thread.      

SQL 2005 and above
Failed Virtual Allocate Bytes: FAIL_VIRTUAL_RESERVE 122880            

 Error: 701, Severity: 17, State: 123.
There is insufficient system memory to run this query.     

As the errors above indicate, the problem is in reserving a fixed size in the virtual address space of SQL Server. Note: the size indicated above in in bytes. e.g. 65536/1024 = 64 KB

Typical symptoms you would notice in these situations are :-

1) Database or Log Backups might start failing
2) You are unable to make a new connections to SQL.
3) Certain jobs which require memory from non-BPool region will fail.  Many a time, these problems go away automatically without any action taken. This indicates an intermittent problem where at a certain point in time, there was virtual memory pressure which resulted in above messages being printed to the SQL Errorlog.  Dealing with VAS fragmentation or running out of VAS on SQL server 2000 was rather painful and required setting up additional debugging techniques to get down to the bottom of the issue. You might have used some of the following tools :-

1) VMSTAT
2) Debugging Tools for Windows (to capture a manual dump of sqlservr.exe)
3) T2551 to generate a filtered dump in SQL Server when running into a OOM condition.
4) TLIST.exe to identify modules loaded in SQL Server.  Luckily, starting with SQL 2005 there is a in-memory DMV which tracks the virtual address space (VAS) of your SQL Server process.

Here are some queries which will help you find out how much virtual address is available on sqlservr.exe which is FREE and how much is total available (Free+InUse)
 

1. Will tell you the size of the biggest contiguous block in VAS

 

SELECT convert(varchar,getdate(),120) as [Timestamp], max(region_size_in_bytes)/1024 [Total max contiguous block size in KB]  from sys.dm_os_virtual_address_dump where region_state = 0x00010000 --- MEM_FREE 
2. Will also tell us size of largest contiguous block plus the region marked as MEM_RESERVE (this is your non-BPool area reserved during SQL Startup, sometimes referred to as MTL – MemToLeave)

 

 

With VASummary(Size,Reserved,Free) AS
(SELECT
    Size = VaDump.Size,
    Reserved =  SUM(CASE(CONVERT(INT, VaDump.Base)^0)
    WHEN 0 THEN 0 ELSE 1 END),
    Free = SUM(CASE(CONVERT(INT, VaDump.Base)^0)
    WHEN 0 THEN 1 ELSE 0 END)
FROM
(
    SELECT  CONVERT(VARBINARY, SUM(region_size_in_bytes))
    AS Size, region_allocation_base_address AS Base
    FROM sys.dm_os_virtual_address_dump 
    WHERE region_allocation_base_address <> 0x0
    GROUP BY region_allocation_base_address 
UNION  
    SELECT CONVERT(VARBINARY, region_size_in_bytes), region_allocation_base_address
    FROM sys.dm_os_virtual_address_dump
    WHERE region_allocation_base_address  = 0x0
)
AS VaDump
GROUP BY Size)  
SELECT SUM(CONVERT(BIGINT,Size)*Free)/1024 AS [Total avail Mem, KB] ,CAST(MAX(Size) AS BIGINT)/1024 AS [Max free size, KB] 
FROM VASummary 
WHERE Free <> 0


 3. Now, below query will identify the memory reserved by non-BPool components in SQL Server

select SUM(virtual_memory_reserved_kb)/1024 as virtual_memory_reserved_mb from sys.dm_os_memory_clerks where type not like '%bufferpool%' 
4. To identify if any of the space used is by SQL Server procedure cache itself, you can use this query:

 

 

SELECT SUM(PAGESUSED)*8/1024 'MB of MemToLeave memory consumed by procedures' FROM MASTER.DBO.SYSCACHEOBJECTS WHERE PAGESUSED >1

DBCC MEMORYSTATUS also has good information on whether the usage from the non-BPool region is coming from SQL or non-SQL components. If its SQL 2000, check the value of “OS Committed” and it is SQL 2005/2008 look at the value of “MultiPage Allocator” for each memory clerk. Just to re-state my assumption in case you are not sure:
1 SQL Server Page = 8 KB –> SinglePage Allocator
> 1 Page or > 8KB –> MultiPage Allocator    Apart from these you need to pay special attention to the following components are all of the below do not use memory from the Buffer Pool region but make direct VirtualAlloc() calls to reserve memory and then commit them:

 1) Any OLE/COM components loaded in SQL Server
2) Extended Stored Procedures (use sys.dm_os_loaded_modules to identify the module loaded in sqlserver process space).
3) SQL Mail components
4) Any prepared documents using sp_xml_preparedocument
5) Linked Server Providers
6) Large Plans stored in Procedure Cache
7) Very frequent Backups also may cause MTL area depletion. (Please investigate using the parameters MAXTRANSFERSIZE and BUFFERCOUNT, if this is the case).
8) SQL CLR (recommended to be used on 64-bit SQL Servers)

Above list is certainly not exhaustive, but is more enough to get started in looking at the right areas. While we are on this topic, it is important to understand the difference between a Reserve and a Commit. These are windows concepts are remain the same for SQL Server as well, after all its VirtualAlloc() underneath the covers.

MEM_COMMIT – Region in VAS this is backed by RAM/paging file
MEM_RESERVE – Region in VAS with no actual physical storage either in RAM or in the paging file.  The problems we talked about so far occur when a call to VirtualAlloc() with MEM_RESERVE is made and that “reservation” fails resulting in the errors printed to errorlog. Most times, the call to reserve is subsequently followed by a COMMIT, but its not mandatory. I can reserve now and commit later on using the base address of the reservation. SQL Server is smart enough and during starting reserves a certain region of the address space referred to as MTL or Non-BPool region. It is here the crunch is and it is here the issue needs to be investigated/fixed.

A quick workaround for above issues is to add the startup parameter –gXXX. (Sample usage- -g512)
XXX- is the amount in MBytes to reserve on startup.I would advise against doing this as this is a workaround where you are increasing the MTL region rather than find out who/what is consuming it. Slava’s blog is a good read also on this topic.

 

MTL Consumers

A previous post of mine talked about SQL Server memory architecture in a nutshell. Let’s find out what are the common consumers of the much talked about MemToLeave region and what you should be looking for in-case you run into MemToLeave issues.

Major Consumers of MTL

    1. Use of Linked Servers – You can find out the linked servers that being used in your environment using the sysservers system catalog.
    2. Use of XML documents – You would have to find out if any queries or procedures perform any kind of XML data manipulation or use sp_xml_preparedocument.
    3. Extended Stored Procedures or sp_OAcreate calls
      1. Extended stored procedure usage can be identified by inspecting the SQL Server Errorlog and searching for the word using. The first XSP calls is logged in the Errorlog in the following manner: Using ‘<dll name>’ version ‘<version>’ to execute extended stored procedure ‘<XSP name>’.
      2. If you are using sp_OAcreate, then this information would be logged in the SQL Errorlog for the first invocation of sp_OAcreate using the same pattern mentioned above. The only difference would be that the DLL name would be odsole70.dll.
    4. Query Plans which are larger than 8 KB in size
      1. If you are using SQL Server 2000, query syscacheobjects and search for entries which have values greater than 1 for the pagesused column.
      2. If you are using SQL Server 2005, search for plans greater than 8KB using the DMV sys.dm_exec_cached_plans. Inspect the size_in_bytes column value and search for entries which have values greater than 8192 bytes.
    5. Use of SQLCLR (Applicable to SQL Server 2005 and above) – Check the memory usage for SQLCLR clerk using DBCC MEMORYSTATUS.
    6. Backups using larger MAXTRANSFERSIZE parameter – In such a case find out BACKUP DATABASE/LOG commands and inspect the MAXTRANSFERSIZE parameter value. The lowest size that can be provided is 65536 bytes.
    7. Using Network Packet Size higher than 8192 bytes. You can find all connections on SQL Server 2005 that use more than 8KB of connection using the following DMV: select * from sys.dm_exec_connections:

select * from sys.dm_exec_connections
where net_packet_size > 8192

How to find the largest free block in the MTL

Often we need to find the largest free block when troubleshooting “MemToLeave” issues with SQL Server Memory. Since, this is related to virtual address space and we didn’t expose any method to query the process address space on SQL Server 2000, you would have to rely on a tool called VMSTAT to find out the largest free block. This tool is used by CSS to troubleshoot MTL issues. However in SQL Server 2005, we added a new DMV which would inform you the status of the virtual address space for SQL. The query below can be used to find the largest contiguous block on the system.

If you are getting the following error:

WARNING:  Failed to reserve contiguous memory of Size= 2162688 (*This value is always in bytes)

If the query below is run during the problem period, then you will find one of the following conditions to be true:

1. The value returned by the query is larger than the value in the error (once you have done the necessary conversions) – This means that when the error occurred, the MemToLeave region of SQL Server was being used by some component due to which a current allocation request failed. This is due to a virtual address space crunch.

2. The value returned by the query is lower than the requested size – This means that you either have a virtual address space fragmentation issue or some component has used up all the available MemToLeave region. (A lot of my colleagues believe that MTL is a misnomer but I will continue to use these throughout the post as it makes referencing the memory region simpler.) In such a case, it would always be advisable to get a Microsoft CSS Engineer to setup data collection tools in place for the next problem occurrence and analyze the existing logs for any tell-tale signs.

Query  (for 64-bit plaform)

SELECT convert(varchar,getdate(),120) as [Timestamp], max(region_size_in_bytes)/1024 [Total max contiguous block size in KB] from sys.dm_os_virtual_address_dump where region_state = 0x0000000000010000 --MEM_FREE
 
Query  (for 32-bit plaform)
SELECT convert(varchar,getdate(),120) as [Timestamp], max (region_size_in_bytes)/1024 [Total max contiguous block size in KB] from sys.dm_os_virtual_address_dump where region_state = 0x00010000 -- MEM_FREE
 
For more information on this DMV, read about it on sys.dm_os_virtual_address_dump
NOTE: The above query is very very performance intensive and shouldn’t be run very frequently on a production system. On a 64-bit system, frequent use of this query can actually degrade performance severely due to the large virtual address space of 64-bit systems.

SQL Server Memory Basics

I will give a brief account of how SQL Server memory is organized (in a nutshell). There is much more to it under the covers! 🙂

SQL Server Memory space is organized in the following manner:

 

Memory


Each instance of Microsoft SQL Server 2000 has an address space with two main components, each of which has several subcomponents:

There are two main areas of memory within SQL Server’s address space, the buffer pool (BPool) and a second memory pool sometimes called the “MemToLeave” area. The contents of the SQL Server buffer pool include cached table data, workspace memory used during query execution for in-memory sorts or hashes, most cached stored procedure and query plans, memory for locks and other internal structures, and the majority of other miscellaneous memory needs of the SQL Server. The Bpool handles memory allocations for 8K and below. All allocation requirements above 8K is satisfied from the MTL region.

Executable code

  1. The number and size of the executable files and dynamic link libraries (DLLs) used by an instance of SQL Server varies over time. In addition to the executable files and DLLs used by Open Data Services, the SQL Server engine, and server Net-Libraries, the following components load in their own DLLs, and these DLLs can allocate memory themselves:
  2. Distributed queries can load an OLE DB Provider DLL on the server running the instance of SQL Server.
  3. Extended stored procedures are implemented as DLLs that are loaded into the address space of the instance of SQL Server.
  4. The OLE Automation system stored procedures are used to create instances of OLE Automation objects. Each class of OLE Automation object loads its own code into the address space of the instance of SQL Server.

Memory pool

The memory pool is the main unit of memory for an instance of SQL Server. Almost all data structures that use memory in an instance of SQL Server are allocated in the memory pool. The main types of objects allocated in the memory pool are:

System-level data structures

These are data structures that hold data global to the instance, such as database descriptors and the lock table.

Buffer cache

This is the pool of buffer pages into which data pages are read.

Procedure cache

This is a pool of pages containing the execution plans for all Transact-SQL statements currently executing in the instance.

Log caches

Each log has a cache of buffer pages used to read and write log pages. The log caches are managed separately from the buffer cache to reduce the synchronization between log and data buffers. This results in fast, robust code.

Connection context

Each connection has a set of data structures that record the current state of the connection. These data structures hold items such as parameter values for queries and stored procedures, cursor positioning information, and tables currently being referenced.

The high level memory architecture remains the same in SQL Server 2005 and 2008. The difference is that we introduced a change called Memory Clerks which were responsible for memory allocation requests for requirements for a particular object/cache store. The allocators changed to Single Page Allocator (any allocation required below 8KB) and Multi Page Allocator (any allocation above 8KB). With the help of SPA and MPA, we are now able to track Buffer Pool and non-Buffer Pool memory requests in a more structured manner when components within SQL Server are requesting for memory. However, if your DLL/CLR code hosted within SQL Server memory is calling VirtualAlloc() directly, then tracking those requires extensive data collection and analysis.

Useful Articles

How to track SQL Server 2000 memory usage using DBCC Memorystatus

How to track SQL Server 2005 memory usage using DBCC Memorystatus