Finding Performance Bottlenecks in Linux

There isn't a computer professional who, at some point, hasn't wondered whether their system(s) are slow due to legitimate load, or inefficiency. The beauty is there's no real reason to sit and wonder. In the case of Linux (and many other operating systems), all of the information you need is at your fingertips. You just have to know how to find it.

Computing bottlenecks occur in four basic areas: CPU, RAM, network, and disk I/O. Linux offers a huge collection of tools for collecting and viewing information about each. Let's take a look at some useful techniques, along with some of the easier fixes for each area when you find a problem.

CPU Performance Inspection

Most new computers today come with multiple CPUs, or some approximation thereof. Some tools allow you to view the individual performance of each of these. However, since the goal here is to measure overall performance, this article focuses on working with a single CPU value. See the man pages for each command for whether it offers flags to go further.

One excellent tool for monitoring CPU performance is sar. This program may not be installed by default on your system; if it isn't, look for the sysstat utilities package for your distribution. Typing sar without any arguments gives you something similar to what you'll see in Figure 1.

Figure 1: An example of default sar output.

From left to right, sar gives you the time the measurement was taken, which CPU it's reporting on (or in our case, all as a collective whole), and then the percentage of CPU in use at that time for:

%user - User space (non-kernel programs)
%nice - Programs whose priority had been altered with the nice or renice commands
%system - Kernel space (the kernel itself plus modules)
%iowait - Waiting to fulfill a disk I/O request
%steal - Forced to wait for the hypervisor to finish servicing another virtual CPU, in the case of virtual machines
%idle - Waiting for new instructions

While all of these columns are interesting, the one that quickly lets you determine if you're CPU-bound is %idle. In the case of Figure 1, this CPU (or bank of CPUs) is practically at the beach on vacation. If the %idle numbers were significantly lower, meaning the CPU was busy most of the time, you would need to consider upgrading the CPU, stopping unnecessary processes, or moving some of the services off of this computer and onto another to improve CPU utilization.
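To make the %idle check repeatable, a minimal Python sketch along these lines could scan saved sar output for busy intervals. The function name, the 20% threshold, and the sample text are assumptions made for illustration (the sample is invented, not a real measurement); the column is located by its header name rather than by a fixed position.

```python
def low_idle_samples(sar_text, threshold=20.0):
    """Return (timestamp, idle) pairs whose %idle is below threshold."""
    idle_col = None
    flagged = []
    for line in sar_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if "%idle" in fields:
            # Header row: remember where %idle sits, counted from the right,
            # so a 12-hour "AM/PM" time column does not shift the index.
            idle_col = fields.index("%idle") - len(fields)
            continue
        if idle_col is None or fields[0].startswith("Average"):
            continue
        try:
            idle = float(fields[idle_col])
        except (ValueError, IndexError):
            continue  # not a data row
        if idle < threshold:
            flagged.append((fields[0], idle))
    return flagged


# Made-up sample in the shape of default sar output (not real measurements).
SAMPLE = """\
12:00:01 CPU %user %nice %system %iowait %steal %idle
12:10:01 all 1.20 0.00 0.50 0.10 0.00 98.20
12:20:01 all 85.00 0.00 9.50 0.50 0.00 5.00
Average: all 43.10 0.00 5.00 0.30 0.00 51.60
"""

print(low_idle_samples(SAMPLE))
```

A single low reading is rarely meaningful on its own; what matters is whether the flagged intervals keep recurring.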

RAM Performance Inspection

The nice thing about sar is that you can also use it to look at your memory. When invoked as sar -r, you see something similar to Figure 2.

Figure 2: An example of sar memory output, invoked with sar -r.

From left to right, this output tells us the time the sample was taken, and then:

kbmemfree - Unused memory in kb
kbmemused - Amount of memory utilized by user space applications in kb
%memused - The percentage of your RAM currently in use
kbbuffers - Amount of memory in kb that your kernel is using to buffer data
kbcached - Amount of memory in kb that the kernel is using to cache data
kbswpfree - Unused swap space in kb
kbswpused - Used swap space in kb
%swpused - The percentage of your swap space currently in use
kbswpcad - Amount of cached swap in kb

Again, while all of these columns are useful, two give you a quick picture of whether your problem is with memory: %memused, and %swpused. While Figure 1 showed a CPU that was sunning itself in Aruba, %memused shows that this computer is consistently operating at the edge of its RAM capacity. The %swpused column tells us that on the other hand, the machine isn't being pushed so hard that it's having to move code from RAM into swap space on the hard drive. For the timespan shown in the measurements, then, you aren't experiencing poor performance.

However, don't be alarmed by the fact that this machine looks like it's one step from having to push things into swap. The kernel's memory manager keeps the pages of the most active applications in physical RAM (in ps's STAT column or top's S column you'll see R for running) and, when memory gets tight, moves pages belonging to idle applications out to swap first (in ps or top these typically show as S for sleeping, although a sleeping process is not necessarily swapped out). The raw percentages of RAM and swap in use therefore don't show the whole picture. Typing ps aux will let you see how many processes at a particular time are sleeping, and what percentage of memory (and CPU) each is using. Knowing how much RAM, how much swap, and how many processes are sleeping, along with how much RAM these processes are using, will help you better understand if you're having RAM bottlenecks. Factors such as shared memory can also make it look like you're using more RAM than you really are.
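Counting sleeping versus running processes by eye gets tedious on a busy machine. As a rough sketch, the tally could be automated as below; the helper names are hypothetical, and it assumes a procps-style ps that accepts -eo stat= to print only the STAT column without a header.

```python
import subprocess
from collections import Counter


def count_states(stat_column):
    """Count processes by the first letter of their ps STAT code."""
    return Counter(code[0] for code in stat_column.split())


def current_states():
    """Ask ps for the STAT column only (no header) and tally it."""
    out = subprocess.run(
        ["ps", "-eo", "stat="], capture_output=True, text=True, check=True
    ).stdout
    return count_states(out)


# Offline example with made-up STAT codes; call current_states() on a live system.
print(count_states("Ss\nS\nR+\nS<\nSl\nZ\n"))
```

Only the first letter of each code is used, because the characters that follow it (such as s, +, or <) are modifiers rather than the process state itself.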

The solutions for improving RAM performance are similar to those for CPU: add more RAM, stop unnecessary programs, or move some of your services off onto another machine. It's also possible that you're suffering memory leaks or that something you're running is very RAM-inefficient. These topics bear further discussion in another article.

Disk I/O Performance Inspection

Yet another reason to use sar is that this Swiss army knife of performance information tools can also tell you how your drives are doing. Type sar -dp and you'll see something like what's shown in Figure 3.

Figure 3: The beginning of sar I/O output, invoked with sar -dp.

This combination of flags shows you information per device, as seeing just the summary information (sar -b) doesn't give you any real reference points at a glance. From left to right, this output gives you the time the measurement was taken, as well as:

DEV - The physical device in question
rd_sec/s - Number of sectors (1 sector = 512 bytes) read per second
wr_sec/s - Number of sectors written per second
avgrq-sz - Average size, in sectors, of the requests issued to the device
avgqu-sz - Average queue length of requests issued to the device
await - Average number of milliseconds I/O requests for this device had to wait before being handled, including how long it took to handle them
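svctm - Average service time, in milliseconds, for I/O requests issued to the device, not counting time spent waiting in the queue (newer sysstat releases warn that this value is unreliable)
%util - Percentage of time during which I/O requests were being issued to the device; values close to 100% mean the device is saturated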

Notice that in this case the percentage is not the most interesting value. The avgqu-sz and svctm columns, read alongside await, are the most useful for determining whether you have an I/O-bound machine. The longer the queue, the more requests are piling up before being serviced. The longer requests have to wait before being serviced, the slower everything gets.
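Since await covers both queueing and service time while svctm covers service time alone, the difference between them approximates how long requests sat in the queue. The following Python sketch applies that arithmetic to saved sar -dp output; the function name, the queue-length threshold of 1.0, and the sample rows are illustrative assumptions, and the sample figures are invented.

```python
def io_pressure(sar_dp_text, max_queue=1.0):
    """Return one dict per sample whose avgqu-sz exceeds max_queue."""
    header = None
    rows = []
    for line in sar_dp_text.splitlines():
        fields = line.split()
        if "DEV" in fields:
            # Keep the column names from DEV onward; values are matched
            # from the right so the time column's width does not matter.
            header = fields[fields.index("DEV"):]
            continue
        if header is None or len(fields) < len(header):
            continue
        if fields[0].startswith("Average"):
            continue
        row = dict(zip(header, fields[-len(header):]))
        try:
            queue = float(row["avgqu-sz"])
            await_ms = float(row["await"])
            svctm_ms = float(row["svctm"])
        except (KeyError, ValueError):
            continue  # column missing or not a data row
        if queue > max_queue:
            rows.append({
                "dev": row["DEV"],
                "avgqu-sz": queue,
                # await includes service time, so the rest was spent queued.
                "queued_ms": round(await_ms - svctm_ms, 2),
            })
    return rows


# Made-up sample in the shape of sar -dp output (not real measurements).
SAMPLE = """\
12:00:01 DEV tps rd_sec/s wr_sec/s avgrq-sz avgqu-sz await svctm %util
12:10:01 sda 2.00 0.00 40.00 20.00 0.01 4.00 2.00 0.40
12:10:01 sdb 150.00 800.00 6400.00 48.00 3.50 23.30 6.50 97.50
"""

print(io_pressure(SAMPLE))
```

Columns are read by name, so a sysstat version that lacks svctm simply yields no rows instead of misreading a neighboring column.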

On an I/O-bound machine, solutions include faster drives (including RAID arrays and other remote storage), organizing your partitions so that I/O-heavy programs aren't all trying to write to the same physical drive, and of course splitting off services onto other machines to spread the load. Very high disk I/O values could in fact mean that you're using a lot of swap.

Network Performance Inspection

While sar (as sar -n ALL) can also show you network performance data, in this case it's a bit of overkill. A quick ifconfig (you may need to include the path) can give you some basic information for a quick visual inspection, as shown in Figure 4.

Figure 4: Network information displayed with /sbin/ifconfig.

The key to understanding this output for performance monitoring purposes is to know that TX stands for transmit and RX stands for receive. If you see values greater than zero for errors, dropped, overruns, and collisions, then you may very well have a network bottleneck problem. The first thing to do is check all of your connections, and equipment such as switches and hubs. Also, check at a few different times and see if the problem is persistent. If it continues, it bears further investigation.
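The same inspection can be scripted. The sketch below, written under the assumption that ifconfig prints its counters either as name:value (older net-tools) or as name value (newer releases), pulls out every non-zero errors, dropped, overruns, and collisions counter. The function name and the sample text are made up for illustration.

```python
import re

# Matches "errors:3" (older net-tools) as well as "errors 3" (newer releases).
COUNTER = re.compile(r"\b(errors|dropped|overruns|collisions)[:\s]\s*(\d+)")


def nonzero_counters(ifconfig_text):
    """Return {"RX errors": n, ...} for every counter greater than zero."""
    problems = {}
    for line in ifconfig_text.splitlines():
        stripped = line.strip()
        direction = stripped[:2] if stripped[:2] in ("RX", "TX") else ""
        for name, value in COUNTER.findall(stripped):
            count = int(value)
            if count > 0:
                # collisions is reported once per interface, without RX/TX.
                if name == "collisions":
                    key = name
                else:
                    key = f"{direction} {name}".strip()
                problems[key] = problems.get(key, 0) + count
    return problems


# Made-up sample in the older net-tools layout (not real measurements).
SAMPLE = """\
eth0      Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          RX packets:1000 errors:0 dropped:12 overruns:0 frame:0
          TX packets:900 errors:3 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
"""

print(nonzero_counters(SAMPLE))
```

Counters are summed across whatever text is passed in, so feeding it the output for a single interface (for example, ifconfig eth0) keeps the result attributable to one device.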

In the case of all four of these issues, this article just skims the surface of both investigation techniques and solutions. In general, you'll want to take these measurements multiple times to see if the problems are persistent or come and go. You might even want to set up cron jobs to take these measurements on an automatic basis.
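For measurements taken on a schedule, one possible approach is a small helper that runs a command and appends its output to a log with a timestamp, which a cron job could then call. This is only a sketch: the function name, the log format, and the paths in the comments are hypothetical, and the sysstat package may already set up its own periodic collection on your distribution.

```python
import subprocess
from datetime import datetime


def record(command, log_path):
    """Run command and append its output, under a timestamp header, to log_path."""
    result = subprocess.run(command, capture_output=True, text=True)
    stamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    with open(log_path, "a") as log:
        log.write(f"=== {stamp} {' '.join(command)} (exit {result.returncode})\n")
        log.write(result.stdout)
    return result.returncode


# Hypothetical usage from a script that cron runs periodically:
#   record(["sar", "-r", "1", "1"], "/var/tmp/perf.log")
#   record(["sar", "-dp", "1", "1"], "/var/tmp/perf.log")
```

Appending rather than overwriting keeps earlier samples available, so you can compare readings taken at different times of day.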

Further installments will address the larger issues of monitoring performance over time, making tweaks that don't involve having to upgrade hardware, and things developers can do to address performance issues with their own software.

Dee-Ann LeBlanc is a freelance writer, editor, trainer, course developer, and journalist essentially specializing in helping people better understand Linux and open source.

posted on 2011-04-21 16:44 by flyonok. Category: linux
