Microsoft
SQL Server &
Solid State Accelerators
by Jeffrey R. Garbus
Soaring Eagle Consulting, Ltd
Executive
Summary
Solid
State Accelerators make it possible for you to give your peripheral
storage the speed of memory. Essentially, Solid State technology is
memory with an attitude that emulates a conventional disk drive. It
includes a built-in UPS backup system that allows the data to be
copied to a built-in disk drive in the event of a power
interruption.
Most
likely, you are reading this paper for one of a very few specific
reasons:
- You
have tuned your database for peak performance, queries are running
optimally, yet due to transaction or query volumes, processing needs
are exceeding the hardware capabilities, and you need to explore
other options;
- You
do not have time before rollout to fully examine your performance
problems, but know that you have them, and are seeking a quick and
permanent hardware solution, which does not require changes to the
server or the code;
- You
know you have an I/O bottleneck, perhaps even know where, and want
to know specifically how (or if) Solid State technology can help
you.
In
this paper, we will discuss when Solid State is the correct solution
to your MS SQL Server performance problem, and specifically how to
implement it.
When
is Hardware the Answer?
Generally,
people in the software industry like to look to software for
solutions. Many experienced DBAs, this author included, feel a sense
of defeat if the application cannot be made to perform suitably
within the constraints of the existing hardware configuration.
This
attitude is silly, of course. You can tune your application within
limits, but you can’t overcome hardware limitations with software
tuning. Sometimes the software does point to hardware limitations
(there is an example below) but many times an experienced DBA just
has to know when to stop tinkering with the software and look at
hardware factors.
The
process of tuning is a process of identifying and eliminating
bottlenecks. By definition, you will always have bottlenecks that
limit your application’s bandwidth. Tuning is the process of
successfully shifting
bottlenecks to places that have a wide enough bandpass to handle
your data throughput needs. Sometimes you need a solution to a
database performance problem that can’t be resolved through
software.
The
traditional reaction to a perceived hardware problem is to buy
additional hardware – most commonly memory and/or CPUs. I have
been in a variety of shops that had purchased additional CPUs (4 to
8 processors), or more memory (4GB to 8GB), and not seen a
significant improvement in performance. Often, adding CPUs is a
knee-jerk reaction to a perception of insufficient user throughput.
More often than not, adding memory is perceived as the panacea for
I/O problems, with the idea that more memory permits more caching
which improves performance.
There
are a number of reasons why this might be the case, but all boil
down to this: If you buy memory or CPUs to solve an I/O problem, you
are likely to be wasting your money. (See Mike Pluta’s white
paper, “The Tragedy of throwing Memory at an I/O Problem,”
(available at http://www.imperialtechnology.com/technology_whitepapers_tragedy.htm).
To summarize this insightful paper, adding memory may alleviate
symptoms, on the surface, but it’s really masking rather than
solving the underlying problem. To make matters worse, whenever the
server or operating system decides that it needs that memory for
other things, your bottleneck gets shifted, suddenly and
unpredictably. In fact, it may be shifting back and forth many times
in a short interval, making tracking down the bottleneck difficult.
The
trick is to solve the specific bottleneck, so those individual
problems get resolved and are scalable, so that as your processing
needs increase, you can use the same (or comparable solution) to
resolve future bottlenecking issues. That is to say, once you
identify a problem, you want to solve that problem, not hide it.
In
summary, there are two basic reasons to choose a hardware solution
to solve a performance problem.
1.
You have specifically identified a hardware bottleneck, in
which case you should be choosing a hardware resource that addresses
that problem (processor, memory, or Solid State technology).
2.
You have identified an application problem, and have
determined that you are short on time or resources to correct it
(alternately, that hardware will solve the problem less expensively
in the short run).
This
remainder of this paper focuses on using Solid State technology to
resolve I/O performance bottlenecks.
Identifying
Your Hardware Bottleneck
The
easiest way to understand your hardware problem is to use the
MS-provided tools, specifically Performance Monitor. Using
Performance Monitor, we can identify issues in each of a set of
performance objects. The specific contents of the performance
objects will differ based on the version of NT / Windows you are
running.
Regardless
of your operating system version, you will want to monitor your
physical disks (or disk suites). Note that monitoring the disks will
create some system overhead. Historically, this has been an increase
of 3-5% of CPU, but with later versions of NT/Windows 2000 this
monitoring seems to have a lesser impact. You’ll have to turn it
on specifically… at a command prompt, type diskperf –y. Then, reboot. Be sure to turn it back off when you’re
done with your hardware performance monitoring (and, reboot!).
Each
disk on your system will have a separate disk object. This gives you
the ability to understand, for each object, the frequency at which
it is being accessed.
For each
physical disk object, you can look at %
disk time, which will tell you how busy the disks are. Note that
you may be hardware bound before you hit 100% busy. Here’s a
sample:

[click
here for enlarged graph]
Here
we’re seeing a few spikes, but in general the sampling points are
showing very low output. If you are seeing a line across the top,
rather than across the bottom, you have an I/O device that is not
keeping up with the system requests. In general, if you are above
70-75% utilization regularly, you do not have sufficient capacity to
handle peak data surges.
Additionally,
check out your Avg. Disk Queue
Length. This should always be less than 1.0 and usually is zero.
As this average queue length number increases, disk
contention increases. Contention means that the system is waiting
while another I/O is accessing the disk. Occasional waits are
normal, but if this is a steady thing, you have an I/O bottleneck.
Note:
You can take an indirect conclusion from the fact that performance
seems slow, but CPU utilization is low… now, this might be locks
or contention, but I/O is a likely culprit.
This is a clear indication to check your disk queue lengths.
You
can look at other indicators, but these should be sufficient to
identify the problem.
Solving
the Specific Problem
Once
you have determined that you have an I/O bottleneck that needs to be
solved, you need to decide on the best approach. It would be easy to
replace the entire disk subsystem with a Solid State Accelerator,
but this may not be needed. Typically,
just a portion of the data on the disk would need to be placed on
Solid State in order to provide significant performance benefit. The
key is identifying the right subsystem and files to place on the
Solid State Accelerator that will give your stressed system the I/O
relief it needs.
Equally
as important is determining the amount of Solid State that you need
to provide the desired result, i.e. Solid State translates directly
to replacing conventional rotating disk. You want to choose the most
active files to place on the Solid State Accelerator, of course, to
take the strain off of the disks where the I/O bottlenecks are the
worst. Good examples of files to put on Solid State include:
- Tempdb,
a very frequent bottleneck for complex online applications as
well as decision support applications that use aggregate
functions (and hence, tempdb). Tip: You may have 5GB of tempb,
but except during the busiest times use only 1-2GB … in this
case, you can configure the first device fragments of tempdb to
exist on the Solid State, and overflow onto other areas. We
recommend that you do not use the “autogrow” feature here,
as it will have a negative performance impact just when your
system is the busiest;
- Transaction
logs of very busy (write-intensive) databases (same tip here; you do
not need to do this for your entire transaction log area, but to put
it in the front can be very effective);
- Databases
that are hit very heavily, which might be catalog information,
lookup tables, or other information that is hit constantly, but
perhaps is too small to be worth spreading out, are also excellent
candidates. Note that while cache may work well with this, all it
might take is a large table scan to flush all of the useful,
accumulated pages from memory.
High
Availability Solid State Accelerators have built-in UPS systems that
make them non-volatile. In
the unlikely event of a power fluctuation, the built-in UPS powers
the system and the built-in hard disk drive is used to effectively
backup the contents of the solid state memory.
When stable power is re-established, user data from the disk
drive is reload to the solid state memory
Cool
Options for Maximizing your Hardware Dollar
What
happens when your needs change on a predictable basis? For instance,
you may have high-volume OLTP processes running from 8 to 5, and
huge batch processing in the evening. Your bottleneck shifts twice a
day.
The
concept of Storage Virtualization in a Storage Area Network (SAN)
gives you the ability to resolve transitional bottlenecks almost
on an on-demand basis.
For
example, let’s assume you have 5GB of Solid State in a SAN. During
the day, you might place tempdb on the Solid State to eliminate your
I/O bottleneck. At night, you might re-allocate the Solid State to
address your month-end batch processing, which has gotten to the
point where it is stretching the batch update window beyond it’s
limits.
Managing
the SAN, you mirror tempdb off of the Solid State Accelerator onto
disk for this processing, and mirror the financial database onto it
for the batch processing. When this finishes, mirror in the other
direction and everything is set for the next business day.
You’ll
note that this is all done in such a way as to be completely
transparent to the servers and end-users. Suddenly, the Solid State
becomes a reusable resource, with one Accelerator able to support
multiple environments and the capacity able to be allocated as
needed to boost performance in multiple application environments.
Further benefit can be attained by using a larger unit and
dedicating some of it to fixed data and have the other space
available for hotspot usage.
Example
Let’s take a look at a client of mine who had problems reaching 100GB/week throughput using 6.5 SQL Server. Their batch processing made very heavy use of tempdb. Because of assorted high-volume processing needs, the SAN option would not work for them.
This client struggled for weeks to tune the system, when the solution couldn’t have been simpler. Based on measured throughput, a small 2GB Solid State Accelerator for tempdb, and a 10GB MegaCache in front of their 400GB production database could have solved 100% of their throughput problems instantly.
Conclusion/Summary
Have
you ever had a problem with I/O, tried spreading it over multiple
devices, tried rewriting procedures or redesigning your database and
still couldn’t get there? Or perhaps, just got there and know that the problem is going to crop up
again? Maybe it is tempdb (a frequent culprit). Maybe in your
environment, your code tables are simply getting hit so hard by your
busy web application that the disk simply can’t keep up.
I’ve
watched several (and heard of dozens more) situations where I/O
limitations really were the problem, but folks wouldn’t admit it
until the application was tuned and retuned. At that point, they
bring in Solid State and wonder why they didn’t do it sooner.
Solid
State is a fast, clean, easy solution to problems that sometimes
require time-consuming, expensively maintained resource juggling.
Sometimes,
the quick, easy solution really is cheaper in the long run, when you
weigh in development time and ongoing maintenance costs.
There
is a very specific set of problems that Solid State addresses, but
it solves them completely. Make sure that you understand the
performance problem you are trying to solve.
Once you understand this, you may notice that Solid State
moves from “interesting idea” to “essential, cost-effective
tool for solving specific performance problems” very quickly.
About
the Author
Jeff’s
background includes a B.S. Degree from Rensselaer Polytechnic
Institute, and work experience from PC's to Mainframes and back
again. Jeff has many years of client / server, Sybase, and Microsoft
SQL Server experience, with a special emphasis on assisting clients
in migrating from existing systems to pilot and large-scale
projects. He is very
well known in the industry, having spoken at user conferences and
user groups1 for many years, written articles and
columns for many magazines2 nationally and
internationally, as well as having written 11 books3.
Recently his focus has been on Very Large Databases, Business Rules
Architecture, and Data Warehousing. Jeff
has been in consulting for fifteen years, training for ten, and in
the software business for twenty. He
has a demonstrated talent for staying at the leading edge of
technology, as well as transferring his knowledge to others. He can
be reached at jeffg@soaringeagleltd.com,
or via his company’s website www.soaringeagleltd.com
1
Sybase International Users' Group, MIS Training Conference,
Microsoft internal development conference, Information User
Association, among others.
2
DM Review,” Sybase Corner of PowerBuilder Development Journal,
European Business Report, SQL Forum Journal, Boxes and Arrows
magazine, among others
3
Including “Microsoft 6.0 Unleashed,” “Microsoft 6.5
Unleashed” The 7.0 exam cram series, “Optimizing SQL Server
7”, “Learn SQL Server 7.0 Administration,” “Learn SQL Server
2000 Administration”
Get
PDF File
|