IBM REDP-4285-00 User Manual

Draft Document for Review May 4, 2007 11:35 am

Linux Performance and Tuning Guidelines

/dev/sdd2 swap swap sw,pri=1 0 0

Swap partitions are used from the highest priority to the lowest (where 32767 is the highest
and 0 is the lowest). Giving the same priority to the first three disks causes the data to be
written to all three disks; the system does not wait until the first swap partition is full before it
starts to write on the next partition. The system uses the first three partitions in parallel and
performance generally improves.

The fourth partition is used if additional space is needed for swapping after the first three are
completely filled up. It is also possible to give all partitions the same priority to stripe the data
over all partitions, but if one drive is slower than the others, performance would decrease. A
general rule is that the swap partitions should be on the fastest drives available.
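
To see which swap areas the kernel will fill first, the active areas can be listed in priority order. The following is a minimal sketch, assuming the usual five-column layout of /proc/swaps (Filename, Type, Size, Used, Priority); the function name swap_by_priority is made up for illustration.

```shell
# List swap areas, highest priority first; areas that share a
# priority are used in parallel and appear next to each other.
# $1: file to read (defaults to /proc/swaps)
swap_by_priority() {
    # Skip the header line, print "priority device", then sort by
    # priority (numeric, descending) and device name (ascending).
    awk 'NR > 1 { print $5, $1 }' "${1:-/proc/swaps}" |
        sort -k1,1nr -k2,2
}
```

Calling swap_by_priority with no argument reads the live /proc/swaps file. Partitions that show the same number in the first column are the ones the system writes to in parallel.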

4.5.3 HugeTLBfs

This memory management feature is valuable for applications that use a large virtual address
space. It is especially useful for database applications.

The CPU’s Translation Lookaside Buffer (TLB) is a small cache used for storing
virtual-to-physical mapping information. By using the TLB, a translation can be performed without
referencing the in-memory page table entry that maps the virtual address. However, to keep
translations as fast as possible, the TLB is typically quite small. It is not uncommon for large
memory applications to exceed the mapping capacity of the TLB.

The HugeTLBfs feature permits an application to use a much larger page size than normal,
so that a single TLB entry can map a correspondingly larger address space. A HugeTLB
entry can vary in size. For example, in an Itanium® 2 system, a huge page might be 1000
times larger than a normal page. This enables the TLB to map 1000 times the virtual address
space of a normal process without incurring a TLB cache miss. For simplicity, this feature is
exposed to applications by means of a file system interface.
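
The effect of page size on TLB coverage is simple arithmetic: the address range a TLB can map is its number of entries multiplied by the page size. The sketch below uses assumed figures (a 128-entry TLB, 4 kB normal pages, and 2 MB huge pages); real values depend on the processor and are not taken from this paper.

```shell
# TLB reach = number of TLB entries x page size.
# $1: number of TLB entries, $2: page size in kB
tlb_reach_kb() {
    echo $(( $1 * $2 ))
}

# Hypothetical 128-entry TLB:
tlb_reach_kb 128 4      # normal 4 kB pages: prints 512 (kB)
tlb_reach_kb 128 2048   # 2 MB huge pages:   prints 262144 (kB, that is 256 MB)
```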

To allocate huge pages, set the number of huge pages in /proc/sys/vm/nr_hugepages by using
the sysctl command:

sysctl -w vm.nr_hugepages=512
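
The kernel can reserve fewer huge pages than requested when it cannot find enough contiguous free memory, so it is worth reading the value back after setting it. The following is a minimal sketch; the function name check_hugepages is made up for illustration.

```shell
# Compare the requested huge page count with what the kernel reserved.
# $1: requested count
# $2: file to read (defaults to /proc/sys/vm/nr_hugepages)
check_hugepages() {
    actual=$(cat "${2:-/proc/sys/vm/nr_hugepages}")
    if [ "$actual" -ge "$1" ]; then
        echo "ok: $actual hugepages reserved"
    else
        echo "short: requested $1, got $actual"
        return 1
    fi
}
```

After running the sysctl command above, check_hugepages 512 reports whether all 512 pages were reserved.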

If your application uses huge pages through the mmap() system call, you have to mount a file
system of type hugetlbfs like this:

mount -t hugetlbfs none /mnt/hugepages
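
Before starting such an application, you can confirm that a hugetlbfs file system is mounted. The following is a minimal sketch, assuming the standard /proc/mounts layout (device, mount point, file system type, options); the function name hugetlbfs_mounts is made up for illustration.

```shell
# Print the mount point of every mounted hugetlbfs file system.
# $1: file to read (defaults to /proc/mounts)
hugetlbfs_mounts() {
    # Field 3 is the file system type, field 2 the mount point.
    awk '$3 == "hugetlbfs" { print $2 }' "${1:-/proc/mounts}"
}
```

An empty result means that no hugetlbfs file system is mounted yet.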

The /proc/meminfo file provides information about hugetlb pages, as shown in Example 4-12.

Example 4-12 Hugepage information in /proc/meminfo

[root@lnxsu4 ~]# cat /proc/meminfo
MemTotal:      4037420 kB
MemFree:       386664 kB
Buffers:         60596 kB
Cached:        238264 kB
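
The hugepage counters can be pulled out of /proc/meminfo with a short filter. The following is a minimal sketch, assuming the HugePages_Total, HugePages_Free, and Hugepagesize fields that the kernel reports when hugepage support is enabled; the function name hugepage_summary is made up for illustration.

```shell
# Print the hugepage counters and the total memory they reserve.
# $1: file to read (defaults to /proc/meminfo)
hugepage_summary() {
    awk '
        /^HugePages_Total:/ { total = $2 }
        /^HugePages_Free:/  { free  = $2 }
        /^Hugepagesize:/    { size  = $2 }   # reported in kB
        END {
            printf "total=%d free=%d size_kB=%d reserved_kB=%d\n",
                   total, free, size, total * size
        }
    ' "${1:-/proc/meminfo}"
}
```

The reserved_kB value is memory set aside for huge pages and therefore unavailable to normal allocations, which is worth keeping in mind when sizing nr_hugepages.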

Important: Although there are good tools to tune the memory subsystem, frequent page
outs should be avoided as much as possible. The swap space is not a replacement for
RAM because it is stored on physical drives that have a significantly slower access time
than memory. Frequent paging out (or swapping out) is therefore almost never desirable.
Before trying to improve the swap process, ensure that your server has enough
memory and that there is no memory leak.