Level: Intermediate
Sean A. Walberg (sean@ertw.com), Senior Network Engineer
31 Mar 2007
Applications using the LAMP (Linux®,
Apache, MySQL, PHP/Perl) architecture are constantly being developed
and deployed. But often the server administrator has little control
over the application itself because it's written by someone else. This
series of three articles discusses many of the server configuration
items that can make or break an application's performance. This first
article covers the LAMP architecture, some measurement techniques, and
some basic Linux kernel, disk, and file system tweaks. Successive
articles investigate tuning the Apache, MySQL, and PHP components.
Linux, Apache, MySQL, and PHP (or Perl) are the
foundation of many Web applications, from to-do lists to blogs to
e-commerce sites. WordPress and Pligg are but two common software
packages powering high-volume Web sites. This architecture has come to
be known simply as LAMP. Almost every distribution of Linux includes
Apache, MySQL, PHP, and Perl, so installing the LAMP software is almost
as easy as saying it.
This ease of installation gives the impression
that the software runs itself, which is simply not true. Eventually the
load on the application outgrows the settings that come bundled with
the back-end servers and application performance suffers. LAMP
installations require constant monitoring, tuning, and evaluation.
Tuning a system has different meanings to
different people. This series of articles focuses on tuning the LAMP
components -- Linux, Apache, MySQL, and PHP. Tuning the application
itself is yet another complex matter. There is a symbiotic relationship
between the application and the back-end servers: a poorly tuned server
causes even the best application to fail under load, and there's only
so much tuning one can do to a server before a badly written
application slows to a crawl. Fortunately, proper system tuning and
monitoring can point to problems in the application.
The LAMP architecture
The first step in tuning any system is
understanding how it works. At the simplest level, a LAMP-based
application is written in a scripting language such as PHP and runs as
part of the Apache Web server on a Linux host.
The PHP application takes information from the
client through the requested URL, any form data, and whatever session
information has been captured to determine what it is supposed to do.
If needed, the server pulls information from a MySQL database (also
running on Linux), combines the information with some Hypertext Markup
Language (HTML) templates, and returns it to the client. This process
repeats itself as the user navigates the application and also occurs in
parallel as multiple people access the system. The flow of data is not
one way, however, because the database may be updated with information
from the user in the form of session data, statistics collection
(including voting), and user-submitted content such as comments or site
updates. In addition to the dynamic elements, there are also static
elements such as images, JavaScript code, and Cascading Style Sheets
(CSS).
Variations on LAMP
LAMP started out as strictly Linux,
Apache, MySQL, and PHP or Perl. It is not uncommon, however, to run
Apache, MySQL, and PHP on Microsoft® Windows® if Linux isn't your
strength. Then again, you can always swap out Apache for something like
lighttpd, and you still have a LAMP-style system, albeit one with an
unpronounceable acronym. Or you may prefer a different open source
database such as PostgreSQL or SQLite, a commercial database such as
IBM® DB2®, or even a commercial but free engine like IBM DB2 Express-C.
This article focuses on the
traditional LAMP architecture because it's the one I see most often in
my travels, and its components are all open source.
After looking at the flow of requests through the
LAMP system, you can begin to see the points where slowdowns might
occur. The database provides much of the dynamic information, so the
client notices any delay in responding to queries. The Web server must
be able to execute the scripts quickly and also handle multiple
concurrent requests. Finally, the underlying operating system must be
in good health to support the applications. Setups that share files
between servers over the network add another possible bottleneck.
Measuring performance
Constant measurement of performance helps in two
ways. The first is that measurement helps you spot trends, both good
and bad. As a simple example, by watching central processing unit (CPU)
usage on a Web server, you can see when it is overloaded. Similarly,
watching the total bandwidth used in the past and extrapolating to the
future helps you determine when network upgrades are needed. These
measurements are best correlated with other measurements and
observations. For example, you might determine that when users complain
of application slowness, the disks happen to be operating at maximum
capacity.
The second use of performance measurements is to
determine if tuning has helped the situation or made it worse. You do
this by comparing measurements before and after the change is made. For
this to be effective, though, only one item should be changed at a
time, and the proper metric should be compared to determine the effect
of the change. The reason for changing only one thing at a time should
be obvious. After all, it is quite possible that two simultaneous
changes could counteract each other. The reason for the metrics
statement is more subtle.
It is crucial that the metrics you choose to watch
reflect the experience of the application's users. If the goal of a change is to
reduce the memory footprint of the database, eliminating various
buffers will certainly help, at the expense of query speed and
application performance. Instead, one of the metrics should be
application response time, which opens up tuning possibilities other
than just the database's memory usage.
You can measure application response time in many
ways. Perhaps the easiest is with the curl
command shown in Listing 1.
Listing 1. Using cURL
to measure the response time of a Web site
$ curl -o /dev/null -s -w '%{time_connect}:%{time_starttransfer}:%{time_total}\n' \
    http://www.canada.com
0.081:0.272:0.779
Listing 1 shows the curl
command being used to look up a popular news site. The output, which
would normally be the HTML code, is sent to /dev/null
with the -o parameter, and -s
turns off any status information. The -w
parameter tells curl to write out some status
information such as the timers described in Table 1:
Table 1. Timers used
by curl
| Timer | Description |
| time_connect | The time it takes to establish the TCP connection to the server |
| time_starttransfer | The time it takes for the Web server to return the first byte of data after the request is issued |
| time_total | The time it takes to complete the request |
Each of these timers is relative to the start of
the transaction, even before the Domain Name Service (DNS) lookup.
Thus, after the request was issued, it took 0.272 - 0.081 = 0.191
seconds for the Web server to process the request and start sending
back data. The client spent 0.779 - 0.272 = 0.507 seconds downloading
the data from the server.
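The subtraction above is easy to automate. The following is a minimal
sketch, assuming the colon-separated output format from Listing 1; the
function name curl_phases and the output labels are made up for
illustration and are not part of curl.

```shell
#!/bin/sh
# curl_phases: split a "connect:starttransfer:total" timing string
# (the -w format used in Listing 1) into per-phase durations.
# The function name and the labels are illustrative only.
curl_phases() {
    printf '%s\n' "$1" | awk -F: '{
        # connect  = time until the TCP connection is up (includes DNS lookup)
        # wait     = time from connection to first byte of the response
        # download = time from first byte to the end of the transfer
        printf "connect=%.3f wait=%.3f download=%.3f\n", $1, $2 - $1, $3 - $2
    }'
}

# The figures from Listing 1:
curl_phases 0.081:0.272:0.779
```

To feed it live data, pass the output of the curl command from Listing 1
as the argument instead of the fixed string.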
By watching curl data
and trending it over time, you get a good idea of how responsive the
site is to users.
Of course, a Web site is more than just a single
page. It has images, JavaScript code, CSS, and cookies to deal with. curl
is good at getting the response time for a single element, but
sometimes you need to see how fast the whole page loads.
The Tamper Data extension for Firefox (see the Resources section for a link)
logs all the requests made by the Web browser and displays the time
each took to download. To use the extension, select Tools
> Tamper Data to open the Ongoing requests window.
Load the page in question, and you'll see the status of each request
made by the browser along with the time the element took to load.
Figure 1 shows the results of loading the developerWorks home page.
Figure 1. Breakdown of
requests used to load the developerWorks home page
Each line describes the loading of one element.
Various data are displayed, such as the time the request started, how
long it took to load, the size, and the results. The Duration column
lists the time the element itself took to load, while the Total
Duration column shows how long all the sub elements took. In Figure 1,
the main page took 516 milliseconds (ms) to load, but it was 5101 ms
before everything was loaded and the entire page could be displayed.
Another helpful mode of the Tamper Data extension
is to graph the output of the page load data. Right-click anywhere in
the top half of the Ongoing requests window and select Graph
all. Figure 2 shows a graphical view of the data from Figure
1.
Figure 2. A graphical
view of requests used to load the developerWorks home page
In Figure 2, the duration of each request is
displayed in dark blue and is shown relative to the start of the page
load. Thus, you can see which requests are slowing down the whole page
load.
Despite the focus on page loading times and user
experience, it is important not to lose sight of the core system
metrics such as disk, memory, CPU, and network. A wealth of utilities
are available to capture this information; perhaps the most helpful are
sar, vmstat,
and iostat. See the Resources
section for more information about these tools.
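As one way to turn these tools into a number you can trend, the sketch
below averages a single column of vmstat output. It assumes the usual
vmstat layout, in which a row of column names (r, b, swpd, and so on)
precedes the samples; the function name avg_vmstat_column is
hypothetical.

```shell
#!/bin/sh
# avg_vmstat_column NAME: read vmstat output on stdin and print the average
# of the column whose header is NAME (for example "id" for idle CPU percent).
# The first sample is skipped because vmstat reports averages since boot there.
avg_vmstat_column() {
    awk -v name="$1" '
        col == 0 {                       # still looking for the header row
            for (i = 1; i <= NF; i++) if ($i == name) col = i
            next
        }
        !skipped { skipped = 1; next }   # drop the since-boot sample
        { sum += $col; n++ }
        END {
            if (n == 0) exit 1           # no usable samples were seen
            printf "%.1f\n", sum / n
        }
    '
}

# Example (not run here): one minute of 5-second samples of idle CPU
#   vmstat 5 13 | avg_vmstat_column id
```

The sketch does not handle the header being repeated partway through a
long run, so keep the sample count small or adapt it as needed.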
Basic system tweaks
Before you tune the Apache, PHP, and MySQL
components of your system, you should take some time to make sure that
the underlying Linux components are operating properly. It goes without
saying that you've already stripped down your list of running services
to only those that you need. In addition to being a good security
practice, doing so saves you both memory and CPU cycles.
Some quick kernel tuning
Most Linux distributions ship with buffers and
other Transmission Control Protocol (TCP) parameters conservatively
defined. You should change these parameters to allocate more memory to
enhancing network performance. Kernel parameters are set through the proc
interface by reading and writing to values in /proc.
Fortunately, the sysctl program manages these
in a somewhat easier fashion by reading values from /etc/sysctl.conf
and populating /proc as necessary. Listing 2
shows some more aggressive network settings that should be used on
Internet servers.
Listing 2.
/etc/sysctl.conf showing more aggressive network settings
# Use TCP syncookies when needed
net.ipv4.tcp_syncookies = 1
# Enable TCP window scaling
net.ipv4.tcp_window_scaling = 1
# Increase TCP max buffer size
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
# Increase Linux autotuning TCP buffer limits
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
# Increase number of ports available
net.ipv4.ip_local_port_range = 1024 65000
Add these settings to whatever is already in /etc/sysctl.conf.
The first setting enables TCP SYN cookies. When a new TCP connection
comes in from a client by means of a packet with the SYN bit set, the
server creates an entry for the half-open connection and responds with
a SYN-ACK packet. In normal operation, the remote client responds with
an ACK packet that moves the half-open connection to fully open. An
attack called the SYN flood ensures that the ACK
packet never returns so that the server runs out of room to process
incoming connections. The SYN cookie feature recognizes this condition
and starts using an elegant method that preserves space in the queue
(see the Resources section
for full details). Most systems have this enabled by default, but it's
worth making sure this one is configured.
Enabling TCP window scaling allows clients to
download data at a higher rate. TCP allows multiple packets to be
sent without an acknowledgment from the remote side, but only up to 64
kilobytes (KB) by default, a limit that is reached quickly when talking
to higher latency peers. Window scaling negotiates a multiplier through
a TCP option so that the window can grow beyond this limit.
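A rough calculation shows why the window matters. A sender can have at
most one window of unacknowledged data in flight per round trip, so the
throughput of a single connection is bounded by the window size divided
by the round-trip time. The sketch below works this out; the 100 ms
round-trip time is an assumed figure chosen for illustration, and the
function name max_throughput is hypothetical.

```shell
#!/bin/sh
# max_throughput WINDOW_BYTES RTT_MS: upper bound on one TCP connection's
# throughput in KB/s, given that at most one window is in flight per round trip.
max_throughput() {
    awk -v win="$1" -v rtt="$2" 'BEGIN {
        printf "%.0f KB/s\n", (win / 1024) / (rtt / 1000)
    }'
}

max_throughput 65535 100      # largest unscaled window, assumed 100 ms round trip
max_throughput 16777216 100   # if the window could grow to the 16MB ceiling in Listing 2
```

Under these assumptions the unscaled window caps a connection at about
640 KB/s no matter how fast the link is, which is the limit that window
scaling removes.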
The next four configuration items increase the TCP
send and receive buffers. This allows the application to get rid of its
data faster so it can serve another request, and it also improves the
remote client's ability to send data when the server gets busier.
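The final configuration item increases the number
of local ports available for use. Local ports are consumed by connections
that the server itself initiates, such as a Web server connecting to a
database on another host, so a wider range allows more of these
connections to be open at a time.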
These settings become effective at next boot or
the next time sysctl -p /etc/sysctl.conf is
run.
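Because each sysctl key corresponds to a file under /proc/sys (dots
become slashes), you can check whether the running kernel matches what is
in the file. The following is a minimal sketch rather than a replacement
for sysctl itself: the function names are hypothetical, PROC_SYS_ROOT is
an override added only so the function can be pointed at a test
directory, and keys that contain dots inside an interface name are not
handled.

```shell
#!/bin/sh
# Root of the kernel parameter tree; overridable only for testing (assumption).
PROC_SYS_ROOT="${PROC_SYS_ROOT:-/proc/sys}"

# Collapse runs of whitespace and trim both ends,
# so "4096<TAB>87380" compares equal to "4096 87380".
normalize() {
    printf '%s\n' "$1" | awk '{ $1 = $1; print }'
}

# check_sysctl_file FILE: report keys in a sysctl.conf-style FILE whose
# running value differs from, or is missing in, the kernel.
check_sysctl_file() {
    conf="$1"
    status=0
    # Comments are stripped first; each remaining line is split on the first "=".
    while IFS='=' read -r key value; do
        key=$(normalize "$key")
        [ -n "$key" ] || continue
        want=$(normalize "$value")
        path="$PROC_SYS_ROOT/$(printf '%s' "$key" | tr . /)"
        if [ ! -r "$path" ]; then
            echo "MISSING $key"
            status=1
            continue
        fi
        have=$(normalize "$(cat "$path")")
        if [ "$have" != "$want" ]; then
            echo "DIFF $key: running=$have file=$want"
            status=1
        fi
    done <<EOF
$(sed -e 's/#.*//' "$conf")
EOF
    return $status
}

# Example (not run here):
#   check_sysctl_file /etc/sysctl.conf
```

No output and a zero exit status mean that every setting in the file is
in effect.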
Configure disks for maximum performance
Disks play a vital role in the LAMP architecture.
Static files, templates, and code are served from disk, as are the data
tables and indexes that make up the database. Much of the tuning to
follow, especially that pertaining to the database, focuses on avoiding
disk access because of the relatively high latency it incurs.
Therefore, it makes sense to spend some time optimizing the disk
hardware.
The first order of business is to ensure that atime
logging is disabled on file systems. The atime
is the last access time of a file, and each time a file is accessed,
the underlying file system must record this timestamp. Because atime
is rarely used by systems administrators, disabling it frees up some
disk time. This is accomplished by adding the noatime
option in the fourth column of /etc/fstab.
Listing 3 shows an example configuration.
Listing 3. A sample
fstab showing how to enable noatime
/dev/VolGroup00/LogVol00 /         ext3   defaults,noatime 1 1
LABEL=/boot              /boot     ext3   defaults,noatime 1 2
devpts                   /dev/pts  devpts gid=5,mode=620   0 0
tmpfs                    /dev/shm  tmpfs  defaults         0 0
proc                     /proc     proc   defaults         0 0
sysfs                    /sys      sysfs  defaults         0 0
LABEL=SWAP-hdb2          swap      swap   defaults         0 0
LABEL=SWAP-hda3          swap      swap   defaults         0 0
Only the ext3 file systems have been modified in
Listing 3 because noatime is helpful only for
file systems that reside on a disk. A reboot is not necessary to effect
this change; you only need to remount each file system. For example, to
remount the root file system, run mount / -o remount.
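After remounting, it is worth confirming that the option took effect. The
sketch below reads the kernel's mount table and lists disk-backed file
systems that are still mounted without noatime. The function name
missing_noatime is hypothetical, and the set of file system types it
checks (ext2, ext3, ext4, and xfs) is an assumption you should adjust for
your own systems.

```shell
#!/bin/sh
# missing_noatime [MOUNTS_FILE]: print the mount points of disk-backed file
# systems that are mounted without the noatime option.
# MOUNTS_FILE defaults to /proc/mounts; fields are: device, mount point,
# type, comma-separated options, dump, pass.
missing_noatime() {
    awk '
        $3 ~ /^(ext[234]|xfs)$/ {
            n = split($4, opts, ",")
            found = 0
            for (i = 1; i <= n; i++) if (opts[i] == "noatime") found = 1
            if (!found) print $2
        }
    ' "${1:-/proc/mounts}"
}

# Example (not run here): no output means every checked file system has noatime
#   missing_noatime
```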
A variety of disk hardware combinations are
possible, and Linux doesn't always reliably detect the optimal way to
access the disks. The hdparm command is used
to get and set the methods used to access IDE disks. hdparm
-t /path/to/device performs a speed test that you can use
as a benchmark. For the most reliable results, the system should be
idle when you run this command. Listing 4 shows a speed test being
performed on hda.
Listing 4. A speed
test being performed on /dev/hda
# hdparm -t /dev/hda
/dev/hda:
 Timing buffered disk reads:  182 MB in  3.02 seconds =  60.31 MB/sec
As the test shows, the disks are reading data at
around 60 megabytes (MB) per second.
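A single run can be skewed by whatever else the system happens to be
doing, so a benchmark is more trustworthy when it is repeated and
averaged. The sketch below, which assumes the output format shown in
Listing 4, averages the throughput figures from several runs; the
function name avg_hdparm is hypothetical.

```shell
#!/bin/sh
# avg_hdparm: read the output of one or more "hdparm -t" runs on stdin and
# print the mean of the reported MB/sec figures.
avg_hdparm() {
    awk '
        /Timing buffered disk reads/ {
            # The throughput is the field just before the trailing "MB/sec".
            sum += $(NF - 1); n++
        }
        END {
            if (n == 0) exit 1           # no benchmark lines were found
            printf "%.2f MB/sec over %d runs\n", sum / n, n
        }
    '
}

# Example (not run here; requires root and an idle system):
#   for i in 1 2 3; do hdparm -t /dev/hda; done | avg_hdparm
```

Run the same loop before and after each hdparm change so that you are
comparing like with like.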
Before delving into some of the disk tuning
options, a warning is in order. The wrong setting can corrupt the file
system. Sometimes you get a warning that the option isn't compatible
with your hardware; sometimes you don't. For this reason, test settings
thoroughly before putting a system into production. Having standard
hardware across all your servers helps here too.
Table 2 lists some of the more common options.
Table 2. Common
options for hdparm
| Option | Description |
| -vi | Query the drive to determine which settings it supports and which settings it is using. |
| -c | Query/enable (E)IDE 32-bit I/O support. hdparm -c 1 /dev/hda enables this. |
| -m | Query/set multiple sectors per interrupt mode. If the setting is greater than zero, up to that number of sectors can be transferred per interrupt. |
| -d 1 -X | Enable direct memory access (DMA) transfers and set the IDE transfer mode. The hdparm man page details the numbers that may go after the -X. You should need to do this only if -vi shows you're not using the fastest mode. |
Unfortunately for Fibre Channel and Small Computer
System Interface (SCSI) systems, tuning is dependent on the particular
driver.
You must add whichever settings you find useful to
your startup scripts, such as rc.local.
Network file system tuning
The network file system (NFS) is a way to share
disk volumes across the network. NFS is helpful to ensure that every
host has a copy of the same data and that changes are reflected across
all nodes. By default, though, NFS is not configured for high-volume
use.
Each client should mount the remote file system
with rsize=32768,wsize=32768,intr,noatime to
ensure the following:
- Large read/write block sizes are used (up to
the specified figure, in this case 32KB).
- NFS operations can be interrupted in case of a
hang.
- The atime won't be constantly updated.
You can put these settings in the options column of /etc/fstab,
in the same way that noatime was added in Listing 3. If you
use the automounter, these go in the appropriate /etc/auto.*
file.
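Listing 3 contains no NFS entry, so here is a sketch of what one might
look like. The small function below prints an /etc/fstab line that uses
the client options listed above. The function name nfs_fstab_line is
hypothetical, and the server name, export path, and mount point in the
example call are placeholders, not values from a real system.

```shell
#!/bin/sh
# nfs_fstab_line SERVER:/EXPORT MOUNTPOINT: print an /etc/fstab entry that
# mounts the export with the client options recommended in this section.
nfs_fstab_line() {
    printf '%s %s nfs rsize=32768,wsize=32768,intr,noatime 0 0\n' "$1" "$2"
}

# Placeholder server and paths; substitute your own.
nfs_fstab_line nfsserver:/export/www /var/www
```

Review the printed line before appending it to /etc/fstab, and test it
with a manual mount of the mount point first.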
On the server side, it is important to make sure
there are enough NFS kernel threads available to handle all your
clients. By default, only one thread is started, though Red Hat and
Fedora systems start at 8. For a busy NFS server, you should push this
number higher, such as 32 or 64, to start. You can evaluate your
clients to see if there was blockage with the nfsstat -rc
command, which shows client Remote Procedure Call (RPC) statistics.
Listing 5 shows the client statistics for a Web server.
Listing 5. Showing an
NFS client's RPC statistics
# nfsstat -rc
Client rpc stats:
calls      retrans    authrefrsh
1465903813   0          0
The second column, retrans,
is zero, showing that no retransmissions were necessary since the last
reboot. If this number is climbing, then you should consider adding
more NFS kernel threads. This is done by passing the number of threads
desired to rpc.nfsd, such as rpc.nfsd
128 to start 128 threads. You can do this at any time.
Threads are started or destroyed as necessary. Again, this should go in
your startup scripts, preferably in the script that starts NFS on your
system.
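To watch whether retransmissions are climbing, you can extract the
counters from the nfsstat output and record them periodically. The
following is a minimal sketch that assumes the three-column layout shown
in Listing 5; the function name nfs_retrans_rate is hypothetical.

```shell
#!/bin/sh
# nfs_retrans_rate: read "nfsstat -rc" output on stdin and print the number
# of retransmissions and their share of all RPC calls.
nfs_retrans_rate() {
    awk '
        header {
            # This is the row of numbers that follows the column names.
            calls = $1; retrans = $2
            pct = (calls > 0) ? 100 * retrans / calls : 0
            printf "retrans=%.0f calls=%.0f (%.4f%%)\n", retrans, calls, pct
            exit
        }
        $1 == "calls" && $2 == "retrans" { header = 1 }
    '
}

# Example (not run here):
#   nfsstat -rc | nfs_retrans_rate
```

Because the counters are cumulative since boot, compare successive
readings rather than judging a single value.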
A final note on NFS: Avoid NFSv2 if you can
because its performance is much worse than that of v3 and v4. This is not an
issue in modern Linux distributions, but check the output of nfsstat
on the server to see if any NFSv2 calls are being made.
Looking ahead
This article covered some of the basics of LAMP
and looked at some simple Linux tuning for LAMP installations. With the
exception of NFS kernel threads, you can set and then ignore the
parameters discussed in this article. The next two articles in this
series focus on Apache, MySQL, and PHP tuning. Tuning them is much
different than tuning Linux because you need to constantly revisit the
parameters as the traffic volumes increase, the read/write
distributions change, and the application evolves.
Resources
Learn
- "Easy system monitoring with SAR" (developerWorks, February 2006)
is a guide to keeping track of key system metrics using sar.
- "Expose Web performance problems with the RRDtool" (developerWorks,
March 2006) is a tutorial by Sean that expands on the cURL technique
and graphs the data for long-term analysis.
- In "Monitoring Virtual Memory with vmstat" (Linux Journal, October
2005), you learn how to observe paging activity on your Linux system.
- If NFS is new to you, "To Protect and Serve: Providing Data to Your
Cluster" (Prentice Hall Professional Technical Reference, February
2005) is a good introduction to both NFS and the automounter, which
helps in large-scale NFS deployments.
- "TCP and Linux' Pluggable Congestion Control Algorithms" (Linux
Gazette, February 2007) discusses how to try out the many algorithms to
implement TCP congestion control that Linux supports, and, more
importantly, the importance and impact of loss and delay on your
network sessions.
- TCP SYN cookies were mentioned earlier as a defense against denial of
service attacks involving SYN floods. Wikipedia describes their
implementation. It's a brilliant idea.
- In the developerWorks Linux zone, find more resources for Linux
developers.
- Stay current with developerWorks technical events and Webcasts.
Get products and technologies
- The Tamper Data extension for Firefox allows you to view and modify
HTTP headers on the fly and to graph the loading of page elements.
- Order the SEK for Linux, a two-DVD set containing the latest IBM
trial software for Linux from DB2®, Lotus®, Rational®, Tivoli®, and
WebSphere®.
- With IBM trial software, available for download directly from
developerWorks, build your next development project on Linux.
About the author
Sean
Walberg has been working with Linux and UNIX since 1994 in academic,
corporate, and Internet service provider environments. He has written
extensively about systems administration over the past several years.