Sunday, 11 December 2022

SAP HANA memory consumption

1. Resident memory:


Physical memorythe actual RAM of the system attached on the motherboard.



To check the total physical number in SAP HANA:
----------------------------------------------------------------------------------------------------------------

DO BEGIN


DECLARE V_INSTANCE_NAME varchar(20);


select t1.VALUE into V_INSTANCE_NAME from  "SYS"."M_SYSTEM_OVERVIEW" t1

where UPPER(t1.NAME) = 'INSTANCE ID';


select V_INSTANCE_NAME as Instance_id, 

       round((FREE_PHYSICAL_MEMORY + USED_PHYSICAL_MEMORY)/1024/1024/1024, 3) as "Total Physical memory resource GB size" 

from PUBLIC.M_HOST_RESOURCE_UTILIZATION;


END;


------------------------------------------------------------------------------------------------------

Resident memorythe real physical memory that processes utilize in the current state.



The below query give:


how much resident memory is used by the HANA database.


----------------------------------------------------------------------------------------------------------------

SELECT HOST,

ROUND(USED_PHYSICAL_MEMORY/1024/1024/1024, 2) AS "Resident GB", 

ROUND((USED_PHYSICAL_MEMORY + FREE_PHYSICAL_MEMORY)/1024/1024/1024, 2) AS "Physical Memory

GB"

FROM PUBLIC.M_HOST_RESOURCE_UTILIZATION


-------------------------------------------------------------------------------------------------------------------



*resident memory used outside HANA (OS, 3rd party processes) is typically quite small, not exceeding 2 GB.

in case, to check free physical RAM  size using current. Just to crosscheck any other process apart from SAP HANA is consuming more RAM.

the below Linux command, need to run at OS level:

--------------------------------------------------------------------------------------------------------------------

FREE_LINUX_MEMORY=`vmstat -s | grep "free memory" | awk '{print $1}'`

RESULT=`echo "scale=3; $FREE_LINUX_MEMORY / 1024 / 1024" | bc -l`

echo "Free physical memoy : $RESULT GB"


-------------------------------------------------------------------------------------------------------------------

* DBACockpit does currently not consider shared memory as part of the resident memory.


Case -2:

2.If processes outside the database are not contributing significantly to memory usage, 2 cases have to be distinguished in general: 

1. An 'out of memory' error has already occurred.  - https://launchpad.support.sap.com/#/notes/1984422


What is OOM:

The memory requirements of SAP HANA exceed the available memory resources or a configured memory limit. When SAP HANA requires additional memory and is not able to allocate new memory or reclaim memory otherwise, an out-of memory (OOM) dump is generated and the action is terminated.

* By default OOM dump is written once in 24HR.

*you may modify the parameter oom_dump_time_delta . 


Major reasons for causing OOM:

* Expensive queries materializing results,

*too many/too large loaded tables,

*too large internal structures/system caches,

*blocked garbage collection by long-running cursors.

* too many sessions, connections, transactions, statements running.

HOW to access OOM file?

1) Go to Administration -> Diagnosis Files;

 2) Search for the Key word - OOM.

3) Consider index server file first.

4)Double click on the file.

5) Show Entire File - option need to select.


How to resolve OOM?

If multiple OOM dump files are generated you should start with analyzing the index server dump file, because in most cases the index server is responsible for the highest memory consumption.

* If you use the Diagnosis Files tab in SAP HANA Studio you have to make sure that the complete file is displayed (“Show Entire File”), otherwise some information can be missing.

The dump file typically contains the following sections that contain useful information for analyzing OOM issues:

[BUILD]        Build information
[THREADS]      Running threads
[STACK_SHORT]  Short call stacks and pending exceptions of all threads
[PROCESS_INFO] Process information
[OS_MEMORY]    Operating system information about memory
[MEMORY_OOM]   Information about current out of memory situation
[STAT]         Statistical data



Key points:
* time stamp of OOM error.

* version number.

* search for the key word - failed to allocate.

* current callstack - is the cause for OOM.

global_allocation_limit - 

* Top allocators 

     inclusive_size_in use - including of all sub allocators
     exclusive size in use -  including of only main allocators     of the heap memory.


Generally 3 kinds of situations in OOM:

1. GAL is less  then required  memory - need to Analysis - * if the table is row table, can convert in to column store table.
* Check the size of table.
* Table 
partition.

* all other possibilities to increase the performance of access the table.

* At last, we can consider to increase the size of RAM.

2.if GAL  is 1000GB

    memory required - 975GB

    failed to allocate - 25GB


* there is high probability to avoid the OOM in the next occurance.
* Can ignore the OOM in this case.

3. IF some cases, even though the size of GAL is more then the size of required memory size.

in this case also, OOM will created - 

GAL - 1000GB
failed to allocate - 0GB


Analysis - check all the OOM occurred at the same time.
               - in other case, if the above case is not cause. Then due to the case of previous SQL queries are not been removed from the memory.


-----------------------------------------------------------------------------------------------------------------

1. Login into the tenant database where the alert got triggered.                                                
2. Check the pool size on the tenant.                                                                                          
3. Communicate to DBA on the approval for SQL plan cache clearing                                     
4. Execute the below on the tenant after DBA approval :                                                           

ALTER SYSTEM CLEAR SQL PLAN CACHE
ALTER SYSTEM CLEAR METADATA CACHE

---------------------------------------------------------------------------------------------------------------

                                                                     
Case 3

2.An 'out of memory' error was not yet triggered, but further investigation on the current memory usage of the database might be required.



1) Go to Administration -> Diagnosis Files;

2) Select "Diagnosis Information -> Collect...";

3) Set the option 'Create and collect one or multiple sets of runtime environment (RTE) dump file(s)';

4) Set the hosts, amount of samples (sets) and frenquency on which the runtime dumps will be recorded;

5) Finish.


More info - 1813020 - How to generate a runtime dump on SAP HANA.


case 4: 

Top down approach:

Which HANA process on which host is using most of the memory?



SELECT TOP 3 HOST, PORT, SERVICE_NAME, TOTAL_MEMORY_USED_SIZE

FROM M_SERVICE_MEMORY

ORDER BY TOTAL_MEMORY_USED_SIZE DESC


Case 5:

So far, the sum of used heap and shared memory has been determined. Now, this total has to be split.





Memory that can be accessed by different processes, e.g.:

  • Specific row store components (tables, catalog, free)
  • Nameserver topology

SELECT TOP 3 HOST, PORT, SERVICE_NAME, HEAP_MEMORY_USED_SIZE, SHARED_MEMORY_USED_SIZE,

TOTAL_MEMORY_USED_SIZE

FROM M_SERVICE_MEMORY

ORDER BY SHARED_MEMORY_USED_SIZE DESC



CASE 6:    Trace file approach                                                                                       The first part of this section lists the local (heap) memory the processes that make up the HANA database are currently using:
[0] PID=34884, SId=42309944, compactors active, alive, process name: hdbnameserver
AB=2220466176b (2.06gb), UA=0b, U=2015851859b (1.87gb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=2220466176b (2.06gb), W=100

[1] PID=35049, SId=42310545, compactors active, alive, process name: hdbpreprocessor
AB=365772800b (348.82mb), UA=0b, U=362430594b (345.64mb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=365772800b (348.82mb), W=100

[3] PID=35094, SId=42310846, compactors active, alive, process name: hdbstatisticsse
AB=17623138051b (16.41gb), UA=0b, U=14624613181b (13.62gb), FSL=268435456b (256mb), PAL=27096579276b (25.23gb), TPAL=17757355779b (16.53gb), W=100

[4] PID=35114, SId=42310947, compactors active, alive, process name: hdbxsengine
AB=2270855168b (2.11gb), UA=0b, U=2136436039b (1.98gb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=2270855168b (2.11gb), W=100

[5] PID=33976, SId=171197412, compactors active, alive, process name: hdbindexserver
AB=240495694077b (223.97gb), UA=0b, U=260528715346b (242.63gb), FSL=0b, PAL=487738426982b (454.24gb), TPAL=270062153728b (251.51gb), W=100

For each process, the allocation limit (PAL),
the amount of memory currently allocated (AB)
used (U)

It might be the case that the request was that large that it was anyway impossible to fulfill it:

[MEMORY_OOM]  Information about current out of memory situation: 
OUT OF MEMORY occurred.
Failed to allocate 2405843009873693952 byte.

 A known issue regarding such large request is solved with revision 50. In case this occurs in newer revisions, please contact SAP Support.


CASE 7:

-----------------------------------------------------------------------------------------------------------------------------------

Change the Global Memory Allocation Limit

The SAP HANA database preallocates a pool of memory from the operating system over time, up to a predefined global allocation limit. You can change the default global allocation limit in the global.ini configuration file. SAP HANA One is pre-configured with 60.5 GB of memory in global_allocation_limit.

Prerequisites

You have the system privilege INIFILE ADMIN.

Context

The global_allocation_limit parameter is used to limit the amount of memory that can be used by the database. The value is the maximum allocation limit in MB. A missing entry or a value of 0 results in the system using the default settings. The global allocation limit is calculated by default as follows: 90% of the first 64 GB of available physical memory on the host plus 97% of each further GB. Or, in the case of small physical memory, physical memory minus 1 GB.

Procedure

  1. In the Administration editor, choose the Configuration tab.
    The configuration files that contain the configuration information for the system are displayed.
  2. Expand the global.ini configuration file and then the memorymanager section.
  3. In the context menu for the global_allocation_limit parameter, choose Change...
    The Change Configuration Value dialog box appears.
  4. Enter a value for the entire system and/or individual hosts.
-----------------------------------------------------------------------------------------------------------------------------

Case 8:

The results of shared memory creation can then be viewed on OS level using the ipcs command.

1. Get the process pid: ps -ef | grep 
2. ipcs -p | grep then displays all segments that were created by this particular process: 
ipcs -p | grep 4221 
86999065 hanadm 4221 4221 
87064602 hanadm 4221 4221 
87130139 hanadm 4221 4221 
87195676 hanadm 4221 4221 
87261213 hanadm 4221 4221 
87359519 hanadm 4221 4221 
88309819 hanadm 4221 4221 
88375356 hanadm 4221 4221 
88440894 hanadm 4221 4221 

3. The size of a particular segment can then be further examined using the command  ipcs -m -i  <id>

 ipcs -m -i 86999065 
Shared memory Segment shmid=86999065 
uid=60000 gid=100 cuid=60000 cgid=100 
mode=01600 access_perms=0600 
bytes=8929752 lpid=4221 cpid=4221 nattch=1 
att_time=Tue May 14 14:09:13 2013 
det_time=Tue May 14 14:09:13 2013 
change_time=Tue May 14 14:09:13 2013 

The sum of all those shared memory segments is then equivalent to the output of the statement:

SELECT SHARED_MEMORY_ALLOCATED_SIZE FROM M_SERVICE_MEMORY WHERE PROCESS_ID = '4221' 

Case 9: Row store reorganization



1813245 - SAP HANA Row Store Reorganization

Reason:

Row store grows by allocating a 64MB memory segments and shrinks by freeing empty segments.A segment is internally divided into fixed-size pages.
When a row store table requires more memory to store records, the table takes a free page from existing segments. If no segment has a free page, a new segment is allocated.

Deleting a large number of records may result in a number of sparse segments. In such a case, row store reorganization can be performed for memory compaction.
The pages in sparse segments are moved to other segments and the resultant empty segments are freed.




When to reorganize the row store

Row store reorganization is recommended, if the allocated row store table size is > 10GB and free page ratio is > 30%.

SELECT HOST, PORT, CASE WHEN (((SUM(FREE_SIZE) / SUM(ALLOCATED_SIZE)) > 0.30) AND SUM(ALLOCATED_SIZE) > TO_DECIMAL(10)*1024*1024*1024) THEN 'TRUE' ELSE 'FALSE' END "Row store Reorganization Recommended", TO_DECIMAL( SUM(FREE_SIZE)*100 / SUM(ALLOCATED_SIZE), 10,2) "Free Space Ratio in %",TO_DECIMAL( SUM(ALLOCATED_SIZE)/1048576, 10, 2) "Allocated Size in MB",TO_DECIMAL( SUM(FREE_SIZE)/1048576, 10, 2) "Free Size in MB" FROM M_RS_MEMORY WHERE ( CATEGORY = 'TABLE' ) and ( ALLOCATED_SIZE > 0 ) GROUP BY HOST, PORT;


in the result of above SQL command:

* Check ROW store Reorganization recommended column.
* Free space ratio  column 

need to consider.

If the result for Reorganization Recommended is TRUE, then row store memory can be reclaimed after row store reorganization.


How to estimate the amount of reclaimable memory

To estimate the maximum possible amount of memory that can be reclaimed, you can use below command:

SELECT HOST, PORT, FREE_SIZE/1024768 "Estimated Maximum Memory Saving in MB" FROM M_RS_MEMORY WHERE CATEGORY IN ('TABLE')



To below process will work for SAP HANA 2.0 >=SPS04:



  1. Online Row Store Reorganization v2
    1. Connect to the system
    2. Execute Online Row Store Reorganization using commands:
    • ALTER SYSTEM RECLAIM DATA SPACE

or

    • CALL REORGANIZE_ROWSTORE('REORGANIZE')

    1. To monitor the progress, check the check the monitoring view M_JOB_PROGRESS, check the Job Progress Tab in Performance Tab of the HANA Studio, or use the JobProgress in Show Statement Library in Cockpit.
    2. Check the indexserver trace files after row store reorganization is done. The trace of Online Row Store Reorganization is separately managed in indexserver_<hostname>.<port>.row_store_reorg.<sequence>.trc
      • [000000000000][0] 0000-00-00 00:00:00.000000 i RowStoreReor Executor.cc(00033) : ==== Reorganize started ====
      • [000000000000][0] 0000-00-00 00:00:00.000000 i RowStoreReor Executor.cc(00241) : Number of target segments: 0
      • [000000000000][0] 0000-00-00 00:00:00.000000 i RowStoreReor Executor.cc(00242) : Expected memory saving: 000 (MB)
      • [000000000000][0] 0000-00-00 00:00:00.000000 i RowStoreReor Executor.cc(00518) : PostReorg: emptied 0 segments

    1. Check the memory fragmentation status after reorganization using:
    • CALL REORGANIZE_ROWSTORE('DISPLAY_MEMORY_STATUS') 


For offline Row Store Reorganization:

please refer Snote - 
1813245

9.1

Indirectly, high memory usage of the row store can cause problems when parameter
client_distribution_mode is set to 'off' in distributed environments. When default value 'statement' is used, the statement is sent to the node where the table is located. With the setting 'off', the statement
might then also be directed to the master node. Since all row store tables are usually located on the
master node and consequently, the row store uses a lot of memory, materializing a large amount of
data on the master node (from a table that is actually located on another node) can then simply be too
much. Therefore, in case of problems with resolving internal and external hostnames, parameter
client_distribution_mode should not be turned off.

9.2: Too many large tables were created as row store tables. The largest tables currently contained in the row store can be retrieved by the following SQL Statement:


SELECT TOP 50 * FROM M_RS_TABLES ORDER BY (ALLOCATED_FIXED_PART_SIZE + ALLOCATED_VARIABLE_PART_SIZE) DESC


It is however important to keep in mind that in many scenarios, switching a table from rowstore to columnstore and vice versa must not be done. This applies to Suite on HANA / BW systems which are delivered with particular settings for different table types. At any case, it makes sense to check whether the system is currently adhering to the recommended configuration or not. SAP note 1659383 contains a list of tables that are supposed to be in the rowstore and a check that is based on this list is part of report RSDU_TABLE_CONSISTENCY. This report also contains a repair option when a deviation is recognized.

 
CASE: 10 - Heap memory analysis.

* In most cases, the process of interest is the index server, since it usually allocates the greatest share of the available memory. 



the below query is used to show inclusive size to exclusive size:

---------------------------------------------------------------------------------------------------------------------------------------------------
SELECT TOP 15 MS.HOST, MS.SERVICE_NAME,MH.CATEGORY, MH.INCLUSIVE_SIZE_IN_USE, MH.EXCLUSIVE_SIZE_IN_USE FROM M_HEAP_MEMORY MH, M_SERVICES MS WHERE MH.PORT = MS.PORT AND MH.HOST = MS.HOST AND MS.SERVICE_NAME = 'indexserver' ORDER BY 4 DESC

----------------------------------------------------------------------------------------------------------------------------------------------------


few case studies:

Pool/parallel/ihm There are cases where this allocator used 90% of the available memory. If such a problem is observed, the system should be upgraded to revision 54 when available. Pool/itab

This allocator is needed for the intermediate result of join operation and translation table used for join operation. This could indicate use of a suboptimal model.

Pool/FemsCompression/CompositeFemsCompressionPool/FemsCompression/CompositeFemsCompression 

A known issue is associated with high memory usage that is solved with revision 53. FEMS compression is used in BW systems to keep the size of the dataset that is transferred to the application server as small as possible. 

Pool/ValueArray 

This is the allocator which keeps the complete resultset in uncompressed form. A known issue has been solved with revision 55 which addresses memory used by this allocator not released after oom situations, thus steadily increasing over time. The details are documented in SAP note 1852425. 

Pool/JoinEvaluator/TranslationTable 

Translation tables are created for caching join column entries of join engine queries. The number of translation tables that are kept in memory is determined by the the following parameter in indexserver.ini: 

[joins]
 translator_cache_size = n 

The default value is 2000. In order to verify how many translation tables currently exist in the system, the following statement can be used: 
select * from sys.M_DEV_JOIN_TRANSLATION_TABLES_ order by TRANSLATION_TABLE_MEMORY_SIZE desc 

This statement sums up the memory usage of those tables: 
select host, port, transtable_memory from sys.m_dev_joinengine_ 

In case of excessive memory usage for this allocator, the parameter can be set to 500 as a starting point. Lowering the value can have an impact on query performance though. However, in live systems where this was done, no negative side effects were observed sofar.

 Pool/malloc/libhdbcsapi.so 


A memory leak was introduced with revision 54 that can lead to a huge growth of this allocator when BW queries containing lots of OR terms instead of IN lists are executed. A possible workaround from application side is the implementation of SAP note 1786777. From HANA side, the problem is solved starting with revision 60 (HANA SP6). See also SAP note 1880274. 

For further information on the meaning of HANA Memory allocators and known reasons why they could consume a high amount of memory please refer to question 13. What can I do if a certain heap allocator is unusually large? from the knowledge base article 1999997 - FAQ: SAP HANA Memory. 

Case 11 - Memory Usage of the Statistics server 

Even though the statistics server can also run into oom errors, it should not consume a large share of the available memory. The statistics server is supposed to have a much smaller allocation limit compared to the index server. This following two configuration parameters are relevant: 

statisticsserver.ini -> 
memory manager

allocationlimit 
minallocationlimit 

The allocation limit of the statisticsserver is either a percentage of the availably RAM or just the value of minallocationlimit, depending on what is the bigger value. In case of oom errors that indicate that the statisticsserver has reached its process allocation limit, it can make sense to increase the allocationlimit to 10% or 15% to have at least a workaround for the time being. However, further analysis regarding the heap memory usage has to be conducted (see above). In the oom trace file, such a situation would look like this: 

[3] PID=23840, SId=320543005, compactors active, alive, process name: hdbstatisticsse 

AB=6755876864b (6.29gb), U=7135415505b (6.64gb), FSL=0b, PAL=6772669235b (6.30gb), TPAL=6772669235b (6.30gb), W=100 

AB, PAL and TPAL are on the same values. Also in this case, further investigation regarding the involved pools is necessary, either using the information contained in the trace files or by using the following SQL statement: 

-----------------------------------------------------------------------------------------------------------------------
SELECT TOP 15 MS.HOST, MS.SERVICE_NAME,MH.CATEGORY, MH.INCLUSIVE_SIZE_IN_USE, MH.EXCLUSIVE_SIZE_IN_USE FROM M_HEAP_MEMORY MH, M_SERVICES MS WHERE MH.PORT = MS.PORT AND MH.HOST = MS.HOST AND MS.SERVICE_NAME = 'statisticsserver' ORDER BY 4 DESC 
-------------------------------------------------------------------------------------------------------------------------


For further information on memory issues with the statistics server please refer to the SAP note 1929538 - HANA Statistics Server - Out of Memory.