Checking UNIX Server Performance
Now, let's go through this information step by step.
7:06pm up 81 days, 7:47, 1 user, load average: 2.90, 2.25, 2.14
The first figure is the server's current time, followed by how long the server has been running since it was last rebooted (here, 81 days, 7 hours and 47 minutes). The next figure shows how many users are logged in to the machine (through protocols such as SSH). Last come the server load averages: the average number of processes running or waiting to run over the past minute, the past 5 minutes, and the past 15 minutes. The next line displays information about processes, but we’ll come back to that in a moment. The lines following these first two provide CPU usage statistics (in this example, the server has two CPUs):
CPU0 states: 66.3% user, 14.4% system, 27.1% nice, 18.4% idle
CPU1 states: 71.1% user, 14.4% system, 37.1% nice, 13.2% idle
Each CPU is identified by its number (starting at 0). Processor execution modes fall into two categories: non-privileged mode, shown as “user”, and privileged mode, shown as “system” and often called “kernel” mode. A process executing in non-privileged mode can access only its own memory, whereas code running in privileged mode has access to all of the kernel's data structures, as well as the underlying hardware. The kernel executes processes in non-privileged mode to prevent user processes from accessing data structures or hardware registers that may affect other processes or the operating environment.
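To make the header lines more concrete, here is a minimal Python sketch that pulls the load averages and the per-CPU percentages out of the sample lines shown above. The function names (parse_load_averages, parse_cpu_states) are made up for this illustration, and the regular expressions assume the exact line layout of this example; other versions of top may format these lines differently.

```python
import re

def parse_load_averages(line):
    """Return the 1-, 5- and 15-minute load averages from an uptime-style line."""
    match = re.search(r"load averages?:\s*([\d.]+),?\s+([\d.]+),?\s+([\d.]+)", line)
    if match is None:
        raise ValueError("no load averages found in line")
    return tuple(float(value) for value in match.groups())

def parse_cpu_states(line):
    """Return (cpu_label, {state: percent}) from a 'CPUn states:' line."""
    label, _, rest = line.partition(" states:")
    states = {name: float(pct) for pct, name in re.findall(r"([\d.]+)%\s+(\w+)", rest)}
    return label, states

uptime_line = "7:06pm up 81 days, 7:47, 1 user, load average: 2.90, 2.25, 2.14"
cpu_lines = [
    "CPU0 states: 66.3% user, 14.4% system, 27.1% nice, 18.4% idle",
    "CPU1 states: 71.1% user, 14.4% system, 37.1% nice, 13.2% idle",
]

loads = parse_load_averages(uptime_line)
cpus = dict(parse_cpu_states(line) for line in cpu_lines)

# Dividing the load by the CPU count gives a rough per-CPU figure.
per_cpu_load = [load / len(cpus) for load in loads]

print("load averages:", loads)
print("per-CPU load:", per_cpu_load)
print("CPU0 idle:", cpus["CPU0"]["idle"])
```

Dividing by the number of CPUs is only a common rule of thumb: a per-CPU load that stays above 1 suggests processes are regularly waiting for a processor.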
On a UNIX system, each process runs at what's called a scheduling priority level. The kernel's scheduler distributes CPU time to processes according to that priority, which can be adjusted with a nice value; on Linux, nice values range from -20 (the highest priority) to 19 (the lowest). Processes with a higher priority get to run ahead of those with a lower priority. The nice percentage is the CPU time spent on user processes running with a positive nice value, that is, at lowered priority. The idle percentage displays the amount of CPU power that isn't in use. The following two lines give figures about memory and swap file usage:
Mem: 3867832K av, 3853780K used, 14052K free, 0K shrd, 100388K buff
Swap: 530104K av, 45980K used, 484124K free 3228192K cached
On each line, the first number is the total amount available, the second is the amount in use, and the third is the amount free, all in kilobytes. Last come the shared, buffer and cache figures. Buffers hold data on its way to or from disk, and the cache keeps recently read file contents in memory so that programs can reach them again without going back to disk. Note that the cached figure, although printed at the end of the Swap line, describes main memory rather than swap.
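The memory figures are easier to judge with a little arithmetic. The Python sketch below parses the two sample lines and estimates how much memory programs themselves occupy. It rests on an assumption worth stating plainly: on Linux systems of this kind, the "used" figure includes buffers and cache, which the kernel can hand back when programs need the space. The helper name parse_kilobytes is invented for this example.

```python
import re

def parse_kilobytes(line):
    """Map each label on a Mem:/Swap: line to its size in kilobytes."""
    return {label: int(size) for size, label in re.findall(r"(\d+)K\s+(\w+)", line)}

mem = parse_kilobytes("Mem: 3867832K av, 3853780K used, 14052K free, 0K shrd, 100388K buff")
swap = parse_kilobytes("Swap: 530104K av, 45980K used, 484124K free 3228192K cached")

# Assumption: "used" counts buffers and cache, which are reclaimable.
# The "cached" figure is printed on the Swap line but refers to main memory.
used_by_programs = mem["used"] - mem["buff"] - swap["cached"]

print("used by programs (K):", used_by_programs)
print("share of total memory: %.1f%%" % (100.0 * used_by_programs / mem["av"]))
```

Read this way, the sample server is less short of memory than the 14052K free figure suggests, since most of the "used" memory is cache.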
Processes
While analyzing the beginning of a typical top result, we skipped a line about processes, and a few other figures. All of these statistics refer to processes: the programs being executed, or run, on a Web server. Let's start with the second line of the top result, which we omitted before.
214 processes: 209 sleeping, 5 running, 0 zombies, 0 stopped
The line starts with a count of total processes, followed by a breakdown of the states those processes are in. There are four states: sleeping, running, zombie, and stopped. The majority of processes are sleeping: they’re waiting for an event, such as incoming data or a timer, and become runnable once it arrives. Only a few processes are running; those are the ones using a CPU, or ready to use one, right now.
During its life, a process can create child processes. When a child exits, the kernel keeps a small record of it until the parent collects its exit status. A child that has exited but has not yet been collected in this way is classified as a zombie process. A zombie no longer runs, but it still occupies a slot in the process table, so a high and growing number of zombies is one thing a system admin dreads: it usually points to a parent that never cleans up after its children. Fortunately, once the parent itself exits, the system adopts the leftover zombies and removes them. Finally, stopped processes are those that have been paused, typically by a job-control signal (for example, Ctrl-Z in a shell) or because a debugger is tracing them.
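As a quick illustration of the summary line, this Python sketch splits it into a total and a per-state count, then checks that the states add up to the total. The function name parse_process_summary is hypothetical, and the pattern assumes the wording of the sample line above.

```python
import re

def parse_process_summary(line):
    """Return (total, {state: count}) from a top process summary line."""
    total_text, _, rest = line.partition(" processes:")
    counts = {state: int(n) for n, state in re.findall(r"(\d+)\s+(\w+)", rest)}
    return int(total_text), counts

summary = "214 processes: 209 sleeping, 5 running, 0 zombies, 0 stopped"
total, counts = parse_process_summary(summary)

# The four states should account for every process.
states_add_up = sum(counts.values()) == total

print(total, counts, states_add_up)
if counts.get("zombies", 0) > 0:
    print("zombies present: look for a parent that is not collecting its children")
```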
As you may have noticed, the bottom of the top result is a long list: a more detailed listing of the processes counted in the second line. Here's an example of part of that list.
  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
10685 ze-card   19  19   984  736   724 R N  93.3  0.0  5671m webalizer
32072 nobody     9   0 16932  15M 12156 S     6.0  0.3   0:01 httpd
32196 nobody     9   0 16224  14M 12144 S     3.5  0.3   0:01 httpd
 1868 nobody     9   0 16284  14M 12620 S     3.1  0.3   0:00 httpd
 2136 nobody     9   0 16080  14M 12164 S     2.5  0.3   0:00 httpd
32205 nobody     9   0 16300  14M 12136 S     2.3  0.3   0:00 httpd
32231 nobody     9   0 16316  14M 12172 S     2.3  0.3   0:00 httpd
32124 nobody     9   0 16620  14M 12184 S     1.9  0.3   0:01 httpd
At the top of the list are short labels for each column, and in order to understand the listings it's a good idea to know what each column shows. The first column, “PID”, shows the unique identifier of a task, called a “process id”. The “USER” column displays the owner of the task. “PRI” is short for priority, the value the scheduler uses to decide which task runs next. “NI” stands for nice, an adjustment to that priority: negative nice values raise a task's priority and positive values lower it.
“SIZE” is the size of the task's code, data, and stack space (in kilobytes). “RSS” is the total physical memory used by the process. “SHARE” represents shared memory. “STAT” is an abbreviation of the word “state”, and refers to the condition of a process. Here, “S” stands for sleeping, “D” for uninterruptible sleep, “R” for running, “Z” for zombie, “T” for stopped or traced, “N” for a process with a positive nice value, and “W” for a swapped-out process. “%CPU” is the task's share of CPU time since the last screen update, and “%MEM” is its share of physical memory. “TIME” shows the total amount of CPU time that the process has used in its lifetime (i.e. the amount of time it has spent in the ‘running’ state). And lastly, “COMMAND” displays the command the task is running.
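The column descriptions above can be put to work with a short Python sketch that turns rows from the sample list into dictionaries and picks out the busiest and the niced processes. It makes two assumptions that hold for this sample but not necessarily elsewhere: the command is a single word, and the state flags may be printed as separate tokens (as in "R N"). The name parse_process_row is invented for the example.

```python
FIXED_COLUMNS = ["PID", "USER", "PRI", "NI", "SIZE", "RSS", "SHARE"]

def parse_process_row(line):
    """Parse one row of the process list into a dictionary keyed by column."""
    fields = line.split()
    row = dict(zip(FIXED_COLUMNS, fields[:7]))
    # Whatever sits between SHARE and the last four columns is the state.
    row["STAT"] = "".join(fields[7:-4])
    row["%CPU"] = float(fields[-4])
    row["%MEM"] = float(fields[-3])
    row["TIME"] = fields[-2]
    row["COMMAND"] = fields[-1]  # assumption: command has no spaces
    row["PID"] = int(row["PID"])
    row["NI"] = int(row["NI"])
    return row

sample_rows = [
    "10685 ze-card 19 19 984 736 724 R N 93.3 0.0 5671m webalizer",
    "32072 nobody 9 0 16932 15M 12156 S 6.0 0.3 0:01 httpd",
    "32196 nobody 9 0 16224 14M 12144 S 3.5 0.3 0:01 httpd",
]

rows = [parse_process_row(line) for line in sample_rows]
busiest = max(rows, key=lambda row: row["%CPU"])
niced_pids = [row["PID"] for row in rows if "N" in row["STAT"]]

print(busiest["COMMAND"], busiest["%CPU"])
print(niced_pids)
```

In the sample, the process using the most CPU is also the one running at the lowest priority (nice 19), which is why it shows up under the nice percentage rather than crowding out the httpd processes.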