Application performance getting slow? Who is to blame?
When we talk about database performance, we are really talking about production data that runs to many megabytes, is accessed by almost every user, and forms the core business of the application system. Performance problems are faced by most ERP applications, which are developed as integrated systems and accessed by many users at once.
The main factors that cause our application and database performance to slow down include:
The architecture of the application
The design of the application
Use of an inappropriate isolation level
Poorly written T-SQL scripts
Hardware resource problems
SQL Server configuration
Observation shows, however, that the factors with the biggest impact on performance tuning are application design, database design, SQL (T-SQL), hardware, and the workload changes revealed by performance monitoring, as the breakdown below shows:
Application design: 25%
Database design: 25%
SQL: 20%
Hardware tuning: 10%
Performance monitoring / workload change: 20%
Most performance problems stem from mistakes made when the application and database were designed, without following best practices and without taking performance into account.
In practice, the need for a performance-conscious design is rarely considered by software developers, especially since most of them build software as an application that grows organically. As a result, we do not know the capacity of our application or its availability characteristics, and we do not even know when we should upgrade the hardware or back up the data in the existing database.
Thursday, January 8, 2009
SQL Server Performance by Yulius Eka Agung Seputra, ST, MSi
PERFORMANCE MONITOR
Performance Monitor collects data about different counters, such as memory use. Performance Monitor can show you data in graphical format in real time, or you can save the data to log files. Pay particular attention to the discussion of log files in this section, because you will also use log files for long-term performance monitoring, which we will discuss in detail later. Working with a log file can be difficult to learn on your own because the options are not intuitive and are hidden on different screens.
You can choose between two Performance Monitors: one in the Administrative Tools group and one in the SQL Server group. They are the same basic program, but you need to run the one in the SQL Server group because it automatically loads the SQL Server-related counters. You run this version of the program with the following command: Perfmon.exe C:\Mssql\Binn\ Sqlctrs.pmc, where the .pmc file is the Performance Monitor counter file that contains the SQL counters. You can write applications that provide your own counters, and you can modify the new system stored procedures called sp_user_counter1 through sp_user_counter10 and track them, too.
When you run the program from the SQL Server group, the window in Figure 15.1 appears. Because you started the program that includes the set of SQL counters, five counters appear at the bottom of the window when Performance Monitor starts. The five counters are
Cache Hit Ratio
I/O — Transactions per second
I/O — Page Reads per second
I/O Single Page Writes per second
User Connections
These counters will be explained in more detail later, but first, let’s learn how to navigate in Performance Monitor.
Changing Menu Options
The first set of buttons on the toolbar at the top of the window corresponds to the four views of the monitor: chart, alert, log, and report views. You can get to the same options using the View menu.
The menu options change depending upon which view is currently active. Without going into too much detail about the View menu options, their basic purpose is to let you set up and save standard viewing templates for each of the four views.
Understanding Counters
Windows NT lets you watch the performance of the system by “counting” the activity associated with any of its objects. Examples of objects in Windows NT are processors, disk drives, and processes. Each object has specific counters associated with it; for example, the % User Time counter is associated with a CPU or processor to designate what percent of the CPU is taken up by user programs (as opposed to system processes). This chapter gives you enough information to help you choose the right counters at the right time.
SQL Server includes many predefined counters, most of which you aren’t likely to use except in special cases. It can be difficult to know which counters are the basic ones to watch. If you have chosen the SQL Server Performance Monitor, several counters have been set up as default counters, such as Cache Hit Ratio and User Connections. You can create your own defaults by creating a .pmc file.
The counters are hooks into the operating system and other programs, like SQL Server, that have been built into the software to let Performance Monitor get data. Data collection is performed efficiently so that the additional load on the system is minimized. Windows NT needs most of the information gathered for managing memory, processes, and threads, and Performance Monitor is a good program to display the results.
On the toolbar in Figure 15.1, the button next to the four view buttons at the top of the window is a big plus sign, which you use to add counters to monitor. Click the + button, and the window in Figure 15.2 will appear. The first field, Computer, has a search button at the end of the field. You can click on this field to bring up a list of all computers in your domain and choose a computer from the list, or you can type the name of a server you want to monitor. To monitor other servers, you need Windows NT administrative privileges on them.
In the next field, Object, you choose an object to monitor. The default is Processor, and the default counter shown in the field below is % Processor Time. The box on the right is the Instance. Any particular resource may have more than one instance; that is, more than one of that particular resource — in this case, processors — may exist. This computer has only one processor (CPU) because the only instance listed in the box is 0. Instance 3 would refer to the fourth CPU.
From the fields along the bottom, you can pick the color, scale, line width, and line style of the information that will be displayed about the counter you are adding. These options let you choose a different look for each counter you add to the window. The only display choice that may need explanation is scale. The scale field is a multiplier that helps you fit the values on the screen in the range you have set on the y-axis, which by default is 0 to 100.
After you choose the Object, Counter, and Instance you want to monitor and determine how you want the information to appear, click Add. The counter is added at the bottom of the list on the main window (Figure 15.1) and starts graphing the next time your data is refreshed.
If you click the Explain button in Figure 15.2, a brief explanation of the counter you specified will appear (Figure 15.3). Sometimes, though, it uses abbreviations and acronyms that require further research, unless you are a Windows NT internals guru.
Setting Up Alerts
An alert is the warning the computer sends you when a resource such as memory or the network becomes a bottleneck. When an alert occurs, it is written to a log file, along with the date and time it occurred. The log file is a circular file, allowing at most 1,000 entries before it starts overwriting the oldest alerts. The alert can also be written to the Windows NT event log.
To add a new alert, click the second button on the toolbar in Figure 15.1, then click the + button. The dialog box shown in Figure 15.4 will appear. Choose the counters you want to create alerts for, then click Add. The example on the screen will create an alert when the Cache Hit Ratio drops below 85 percent.
Notice the Run Program option in the lower right portion of the screen. You can use it to execute a program when the alert occurs. For example, you can choose SQL Server — Log in the Object field, Log Space Used (%) for the Counter, and the database you want to monitor from the Instance list. When the log file for that database gets above 90 percent, you can execute a batch file that runs an ISQL script to dump the transaction log. In this way you can reduce your chances of running out of log space.
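As a rough sketch of such a batch file, a single ISQL call is enough; the server name, password, and file path below are placeholders, and the DUMP TRANSACTION statement uses the SQL Server 6.x syntax:
REM dumplog.bat - hypothetical batch file launched by the Performance Monitor alert
isql -S MYSERVER -U sa -P MyPassword -Q "DUMP TRANSACTION MyDatabase TO DISK = 'C:\Mssql\Backup\MyDatabase_log.dat'"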
Starting Log Files
Learning how to establish log files is very important, because log files are a critical part of the long-term strategy recommended later in this chapter. It can be a bit confusing, so let’s go through the steps.
Click the third button on the toolbar — View Output Log File Status. Notice that the Log File entry at the top is blank, the status is closed, the file size is zero, and the log interval is 15.00 (seconds).
Click +, and the list of objects shown in Figure 15.5 will appear. Select the ones you want to add to the log and click Add. If you hold down the Ctrl key while selecting, you can choose more than one counter, and holding down Shift lets you highlight all the items in a range. All counters in the objects you pick will be tracked in the log file. We will discuss what to monitor later.
Now we need to specify a log file. From the Options menu, choose Log. The dialog box shown in Figure 15.6 will appear.
This dialog box looks almost like the standard file dialog box, but it has two very important additions. At the bottom of the screen, the Update Time section shows the refresh interval. For short-term tracking, keep it at 15 seconds. For long-term tracking, set it at 300 seconds (5 minutes). The other important difference between this dialog box and the standard file name dialog box is the Start Log button. Nothing happens until you click this button to start collecting data. Once you do, the text of the button will change to Stop Log.
Type a log file name in the File Name box at the top. Then click Start Log.
Click OK to close this dialog box, then minimize the window and let the log run for a while.
Maximize the window and click the Stop Log button. Then switch to the Chart view by clicking the first button on the toolbar.
From the Options menu, choose Data From. Select the log file you named earlier. You can then choose the counters you want to view from the log.
The best part about using log files is that you can view a few counters at a time to avoid overcrowding the window. You can also mix and match the counters you want to analyze at the same time. This feature is important because many of the counters depend on other counters.
Special Note: The log file does not do anything until you click the Start Log button in the Log Options dialog box (also available by choosing Log in the Options menu).
Reports
The fourth button on the toolbar, the Reports button, lets you print customized reports of the data collected in your log file. Experiment with the available reports when you have a chance; we won’t cover this option here.
TIME INTERVALS
The default refresh interval for Performance Monitor is one second. Every second, you get new information about your system’s performance. This interval is good for a very short-term examination of the system, but it can be a drain on the server. A five-second interval causes much less overhead, probably in the neighborhood of five percent extra activity. However, for long-term monitoring, 5 seconds produces a very large log file.
Setting the interval to 5 minutes creates a reasonable size log file, but this large an interval can mask performance peaks. However, because each entry in the log file stores the minimum, average, and maximum values for each counter, or aspect of SQL Server you want to monitor, you can discover the peaks with a little extra analysis. Five minutes is a good setting for long-term logging. You can always fire up another copy of Performance Monitor and look at one- to five-second intervals if you want a short-term peek into the system’s performance.
To determine the amount of drain on the system from Performance Monitor, shut down all the services and start the Monitor again. Add the CPU usage and watch it for about 30 seconds at the default interval of one second. Then change the interval to 0.1 seconds. Your CPU usage will jump dramatically. One odd observation is that the effect of changing from one second to 0.1 seconds is different on different computers, and it is different between Windows NT 4.0 and Windows NT 3.51. For example, when changing the interval on two 133 MHz computers — a laptop and a tower box — the tower machine has the better performance at the shorter interval, showing about 55 percent utilization, while the laptop shows about 60 percent utilization.
Special Note: The faster your refresh option, the more the drain on the system. The default one-second refresh interval creates less than 5 percent overhead on a single-processor machine. For multiprocessor machines, the overhead is negligible. With the refresh interval set to 0.01 seconds, Performance Monitor takes about 60 percent of the resources. At 10 seconds per refresh, the drain is almost too small to measure, even with a lot of counters turned on.
WHAT TO MONITOR
Now that you know how to use the program, let’s get to the section you’ve been waiting for: How do you know what to monitor? Of the hundreds of Windows NT counters and 50 or so SQL counters, how do you choose? Should you monitor everything? How long should you monitor the system?
Monitoring performance helps you perform two related tasks: identifying bottlenecks and planning for your future hardware and software needs (capacity planning). Learning about the important counters will help identify potential bottlenecks. The strategy section later in this chapter will help you put together a good plan for creating a general monitoring strategy.
What do you want to monitor? Everything! Well, monitoring everything may be a good idea for a short period, but the results will show that many of the counters are always at or near zero; monitoring them all the time may be a waste of time and resources. You need to establish a baseline for your system. This baseline lets you know what results are normal and what results indicate a problem. Once you establish a baseline, you don’t need to track everything.
The key categories to monitor can be split into two major sections: Windows NT categories and SQL Server categories. Categories in this sense are groups of objects that contain counters.
Windows NT
Memory
Processor
Disk I/O
Network
SQL Server
Cache
Disk I/O
Log
Locks
Users
Other Predefined Counters
User-Defined Counters
When monitoring both categories of data, look for trends of high and low activity. For example, particular times during the day, certain days of the week, or certain weeks of the month might show more activity than others. After you identify highs and lows, try to redistribute the workload. These peaks and valleys are especially good to know when something new is added to the schedule. If the peak loads are causing problems, identify which things can be scheduled at a later time when the system is not so busy. Knowing the load patterns is also helpful when problems occur, so that you can re-run a particular job or report when the load is low.
Get to know your users — find out which reports they need first thing in the morning. Perhaps you can schedule these reports to run at night in a batch mode, instead of having the user starting them during a busy time.
Monitoring Windows NT
The purpose of monitoring the Windows NT categories is to answer one of two questions: “What resource is my bottleneck?” or “Do I see any upward usage trends that tell me what resource I might run low on first?” SQL Server 6.5 introduced several highwater markers, such as Max Tempdb space used, which make it easier to identify potential long-term problems.
Memory
The Memory: Pages/sec counter is the number of pages read or written to the disk when the system can’t find the page in memory. This page management process is referred to as paging. If the average value for this counter is five, you need to tune the system. If this value is 10 or more, put tuning the server high on your priority list. Before SQL Server 6.0, the value for this counter was an important flag to tell you whether memory was the bottleneck. Now, with SQL Server’s parallel read-ahead feature, this counter will give you only an indication of how busy the read-ahead manager is. However, we will discuss other counters that are better at tracking the read-ahead manager. In other words, this counter may have been one of the most significant counters to track in the past, and it still is on machines without SQL Server, but better ones are available to track memory.
The Memory: Available Bytes counter displays the amount of free physical memory. If the value for this counter is consistently less than 10 percent of your total memory, paging is probably occurring. You have too much memory allocated to SQL Server and not enough to Windows NT.
Processor
Before we start talking about the counters in the processor category, it is important to know that Windows NT assigns certain responsibilities to certain processors if you have four or more CPUs. Processor 0 is the default CPU for the I/O subsystem. Network Interface Cards (NIC) are assigned to the remaining CPUs, starting from the highest-numbered CPU. If you have four processors and one NIC, that card is assigned Processor 3. The next NIC gets Processor 2. Windows NT does a good job of spreading out processor use. You can also set which processors SQL Server uses. See Chapter 16, “Performance Tuning,” particularly the notes on the Affinity Mask, for more information about allocating processors.
You can monitor each processor individually or all the processors together. For monitoring individual processors, use the Processor: % Processor Time counter. This counter lets you see which processors are the busiest.
A better counter to monitor over the long term is the System: % Total Processor Time counter, which groups all the processors to tell you the average percentage of time that all processors were busy executing non-idle threads.
Who (or what) is consuming the CPU time? Is it the users, system interrupts, or other system processes? The Processor: Interrupts/sec counter will tell you if it is the system interrupts. A value of more than 1,000 indicates that you should get better network cards, disk controllers, or both. If the Processor: % Privileged Time is greater than 20 percent (of the total processor time) and Processor: % User Time is consistently less than 80 percent, then SQL Server is probably generating excessive I/O requests to the system. If your machine is not a dedicated SQL Server machine, make it so. If none of these situations is occurring, user processes are consuming the CPU. We will look at how to monitor user processes when we consider SQL Server-specific counters in the next section.
Disk I/O
As discussed in Chapter 16, “Performance Tuning,” having many smaller drives is better than having one large drive for SQL Server machines. Let’s say that you need 4 GB of disk space to support your application with SQL Server. Buy four 1-GB drives instead of one 4-GB drive. Even though the seek time is faster on the larger drive, you will still get a tremendous performance improvement by spreading files, tables, and logs among more than one drive.
Special Note: The single best performance increase on a SQL Server box comes from spreading I/O among multiple drives (adding memory is a close second).
Monitor the disk counters to see whether the I/O subsystem is the bottleneck, and if it is, to determine which disk is the culprit. The problem may be the disk controller board. The first thing to know about monitoring disk I/O is that to get accurate readings from the Physical Disk counters, you must go to a command prompt window and type DISKPERF -y, then reboot. This procedure turns on the operating system hooks into the disk subsystem. However, this setup also causes a small performance decrease of 3 to 5 percent, so you want to turn this on only periodically and only for a short period. Use the Diskperf -n command to turn it off, then restart your system.
Track Physical Disk: % Disk Time to see how much time each disk is busy servicing I/O, including time spent waiting in the disk driver queue. If this counter is near 100 percent on a consistent basis, then the physical disk is the bottleneck. Do you rush out and buy another disk? Perhaps that is the best strategy if the other drives are also busy, but you have other options. You may get more benefit from buying another controller and splitting the I/O load between the different controllers. Find out what files or SQL Server tables reside on that disk, and move the busy ones to another drive. If the bottleneck is the system drive, split the virtual memory swap file to another drive, or move the whole file to a less busy drive. You should already have split the swap file, unless you only have one drive (which is very silly on a SQL Server machine).
LogicalDisk: Disk Queue Length and PhysicalDisk: Disk Queue Length can reveal whether particular drives are too busy. These counters track how many requests are waiting in line for the disk to become available. Values of less than 2 are good; anything higher is too high.
Network
Redirector: Read Bytes Network/Sec gives the actual rate at which bytes are being read from the network. Dividing this value by the value for the Redirector: Bytes Received/Sec counter gives the efficiency with which the bytes are being processed.
If this ratio is 1:1, your system is processing network packets as fast as it gets them. If this ratio is below 0.8, then the network packets are coming in faster than your system can process them. To correct this problem on a multiprocessor system, use the Affinity Mask and SMP Concurrency options in the SQL Configuration dialog box to allocate the last processor to the network card, and don’t let SQL Server use that processor. For example, if you have four CPUs, set the Affinity Mask to 7 (binary 0111) and SMP Concurrency to 3. This setup gives three CPUs to SQL Server and the fourth processor to the network card, which Windows NT assigns to that processor by default. If I/O is also a problem, set the Affinity Mask to 6 (binary 0110) and SMP Concurrency to 2, because Windows NT assigns the I/O subsystem to the first processor by default.
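If you prefer to make this change from a query window instead of the configuration dialog box, a minimal sketch with sp_configure might look like the following. The values assume the four-CPU, one-NIC example above; both settings are advanced options (so you may need to make advanced options visible first), and the change takes effect only after SQL Server is restarted.
-- Give SQL Server CPUs 0-2 (mask 7 = binary 0111) and leave CPU 3 for the network card
sp_configure 'affinity mask', 7
go
sp_configure 'SMP concurrency', 3
go
reconfigure with override
go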
Monitoring SQL Server
The questions to ask yourself when monitoring the SQL Server categories are “Do I have the optimal configuration values for SQL Server?” and “Who is consistently using the most resources?”
If any of the counters considered in this section indicate a problem, the problem is somewhere related to SQL Server. If the problem is I/O, memory, CPU, or locks, you can dig deeper and find out who the culprits are. However, if you are using a long-term logging strategy for monitoring, you must monitor every session to be sure you have the necessary historical data when you want to see what was happening at a particular time.
If you are watching the monitor when a problem occurs, go to the SQL Server-Users object and turn on the counter for all instances. The instances in this case are the sessions currently logged on. You can see the login ID and the session number. If you see one or more sessions causing the problem, you can spy on them to find the last command sent. Go to the Enterprise Manager, click the Current Activity button on the toolbar, and double-click the line in the display corresponding to the session number. You will see the last command received from the session. To trace commands in more depth, use the SQLTrace utility that is new with version 6.5. (See Chapter 3, “Administrative and Programming Tools,” for details.)
The five main categories of SQL Server counters to monitor are cache, disk I/O, log, locks, and users. We will consider each of these categories separately as well as a mix of other important predefined counters. The final part of this section discusses the new user-defined counters.
Cache
To monitor your cache, watch SQL Server — Cache Hit Ratio. It monitors the rate at which the system finds pages in memory without having to go to disk. The cache hit ratio is the number of logical reads divided by the total of logical plus physical reads. If the value for this counter is consistently less than 80 percent, you should allocate more memory to SQL Server, buy more system memory, or both. However, before you buy more memory, you can try changing the read-ahead configuration options. Also look at the discussion of free buffers in the next chapter to determine whether the number of free buffers is approaching zero. Changing the free buffers configuration parameter may increase the cache hit ratio.
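For example, if a monitoring interval shows 9,500 logical reads and 500 physical reads, the cache hit ratio is 9,500 divided by (9,500 + 500), or 95 percent, which is comfortably above the 80 percent guideline.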
To find out if you have configured SQL Server properly, you should monitor SQL Server-Procedure Cache: Max Procedure Cache Used (%). If this counter approaches or exceeds 90 percent during normal usage, increase the procedure cache in the SQL Server configuration options. If the maximum cache used is less than 50 percent, you can decrease the configuration value and give more memory to the data cache. Rumor has it that SQL Server 7.0 will have a floating-point number for the procedure cache configuration parameter so that you can give the procedure cache less than 1 percent of your SQL Server memory. For a super server with gigabytes of memory, even 1 percent is too much for procedure cache.
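For reference, the procedure cache setting is a percentage of SQL Server's memory and is changed with sp_configure; the value below is purely illustrative, and the new setting takes effect when SQL Server is restarted.
-- Illustrative only: shrink the procedure cache from the default 30 percent to 20 percent
sp_configure 'procedure cache', 20
go
reconfigure
go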
If a 2K data page has been swapped to the Windows NT virtual memory file and read in again later, SQL Server still counts the page as already in memory for the purposes of the Cache Hit Ratio counter. Therefore, a system bogged down by heavy swapping to virtual memory could still show a good cache hit ratio. To find out if your system is in this category, monitor the Memory: Page Faults/Sec counter.
The Memory: Page Faults/Sec counter watches the number of times a page was fetched from virtual memory, meaning that the page had been swapped to the Windows NT swap file. It also adds to the counter the number of pages shared by other processes. This value can be high while system services, including SQL Server, are starting up. If it is consistently high, you may have given too much memory to SQL Server. The network and operating system may not have enough memory to operate efficiently.
Warning: This counter is a strange one to figure out. Running this counter on four different types of machines gave widely different results. To try to get a baseline value, we turned off all services, including SQL Server, unplugged the boxes from the network, and ran Performance Monitor with only the Memory: Page Faults/Sec counter turned on. The lowest measurement of page faults per second was from the system we least expected — a 50 MHz 486 with 16 MB of memory and one disk drive. It settled in at about five to seven page faults per second. The DEC Alpha with 4 processors, 10 GB RAID 5 striping on 5 drives, and 256 MB of memory was up in the 35 to 40 page faults per second range. So was a similarly configured Compaq ProLiant. The laptop performed in the middle, at about 15 page faults per second. It is a 90 MHz Pentium with 1 disk drive and 40 MB of memory. All were running Microsoft Windows NT version 3.51 service pack 4. All services except Server and Workstation were turned off. Running the same experiment with Windows NT 4.0 service pack 1 showed approximately the same results, except that the page faults per second numbers ran consistently 10 percent less than in Windows NT 3.51.
The result of this experiment is that we can’t recommend a range to gauge the performance of your machine. The best you can do is turn off all services for a brief period to get a baseline measurement on your machine, then use this value as a guide for your regular usage.
Disk I/O
Several counters measure how busy your disk drives are and which disk drives are the busiest. Remember that for any I/O measurements to be effective, you must run the Windows NT Diskperf -y command and reboot the system.
Even though the SQL Server: I/O Transactions Per Second counter is a bit misleading, it is still good, especially for capacity planning. This counter measures the number of Transact-SQL batches processed since the last refresh period. You should not use these results against any standard TPC benchmark tests that give results in transactions per second — it is not referring to a Begin/Commit transaction, just to batches of commands. Watch this number over a span of several months, because an increase in this counter can indicate that the use of SQL Server is growing.
The SQL Server: I/O — Lazy Writes/Sec counter monitors the number of pages per second that the lazy writer is flushing to disk. The lazy writer is the background Windows NT process that takes the data from cached memory and writes it to disk, although sometimes a lazy writer is hardware that reads the cached memory on the disk drive and is managed by the disk controller. A sustained high rate of lazy writes per second could indicate any of three possible problems:
the Recovery Interval configuration parameter is too short, causing many checkpoints
too little memory is available for page caching
the Free Buffers parameter is set too low
Normally this rate is zero until the least-recently used (LRU) threshold is reached. LRU is the indicator by which memory is released for use by other processes. Buying more memory may be the best solution if the configuration parameters seem to be in line for your server size.
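Both the Recovery Interval and Free Buffers parameters are set with sp_configure (run it with just the option name to see the current value); the numbers below are placeholders rather than recommendations, and Free Buffers may require advanced options to be visible first.
-- Placeholder values only; recovery interval is in minutes
sp_configure 'recovery interval', 5
go
sp_configure 'free buffers', 409
go
reconfigure
go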
The SQL Server: I/O Outstanding Reads counter and the I/O Outstanding Writes counter measure the number of physical reads and writes pending. These counters are similar to the PhysicalDisk: Disk Queue Length counter. A high value for this counter for a sustained period may point to the disk drives as a bottleneck. Adding memory to the data cache and tuning the read-ahead parameters can decrease the physical reads.
The SQL Server: I/O Page Reads per Second counter is the number of pages not found in SQL Server data cache, which indicates physical reads of data pages from disk. This value does not count pages that are read from the Windows NT virtual memory disk file. There is no way to watch only the logical page reads per second. According to sources in the SQL development team, counters for logical page reads are hidden in a structure that is not available in this version of SQL Server. However, you can figure out the logical page reads per second by taking the total page reads per second and subtracting the physical page reads per second.
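For example, if the total page reads come to 500 per second while the physical page reads counter shows 120 per second, the logical page reads are roughly 380 per second.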
You should occasionally turn on the I/O Single Page Writes counter. A lot of single page writes means you need to tune SQL Server, because it is writing single pages to disk instead of its normal block of pages. Most writes consist of an entire extent (eight pages) and are performed at a checkpoint. The lazy writer normally writes an entire extent at a time. When SQL Server is forced to hunt for free pages, it starts finding and writing the LRU pages to disk — one page at a time. A high number of single page writes means that SQL Server does not have enough memory to keep a normal amount of pages in data cache. Your choices are to give more memory to SQL Server by taking memory away from the static buffers, by decreasing the procedure cache, or by decreasing the amount of memory allocated to Windows NT.
Log
Tie the SQL Server — Log: Log space used (%) counter to an alert. When the value goes over 80 percent, send a message to the administrator and to the Windows NT event log. When it goes over 90 percent, dump the transaction log to a disk file (not the diskdump device), which will back up the log and truncate it. You want to track this counter for all your application databases, for Tempdb, and for the Distribution database if you are running replication.
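If you also want to check log space from ISQL or from a scheduled job, DBCC SQLPERF can report it for every database:
-- Reports the log size and the percentage of log space used for each database
DBCC SQLPERF(LOGSPACE)
go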
Locks
To check out locking, turn on the SQL Server Locks: Total Locks and Total Blocking Locks counters. If you notice a period of heavy locking, turn on some of the other lock counters to get a better breakdown of the problem. The value for Total Blocking Locks should be zero or close to it as often as possible.
One counter to turn on to see if you have configured the system correctly is SQL Server Licensing: Max Client Count. Once you have established that your licensing choice is correct, turn it off. You should turn it back on occasionally to check the connections. If you do exceed the license count, you will know because users will be denied access.
Users
When you suspect that one particular user is the cause of any performance problems, turn on the counters in the Users section. However, with many users on the system, it is difficult to guess which counters to use, and it is difficult to turn on all counters for all sessions. One shortcut is to go into the Current Activity screen of the SQL Enterprise Manager and look at the locks in the Locks tab as well as the changes in CPU and Disk I/O activity in the Detail tab.
Monitor the SQL Server — Users: CPU Time counter for each user. Users for whom this counter returns high values may use inefficient queries. If the query appears reasonable, a high value may indicate an indexing problem or poor database design. Use Showplan to determine if the database’s indexes are optimal. Look for wide tables (long row sizes), which indicate a non-normalized database. Wide tables and inefficient indexes can cause more I/O than table scans.
Other Predefined Counters
A new counter in SQL Server 6.5, SQL Server: Max Tempdb Space Used, indicates how well you have estimated the size of Tempdb. If the value for this counter is very small, you know you have overestimated the size of Tempdb. Be sure to watch this counter frequently, especially during the busiest times and when your nightly jobs run. If it approaches the size of Tempdb, then you should probably increase Tempdb’s size.
Compare SQL Server: NET — Network Reads/Sec to SQL Server: NET — Bytes Received/Sec (or Network Writes/Sec compared to Bytes Transmitted/Sec). If the SQL Server network counters are significantly lower than the corresponding Server object counters, your server is busy processing network packets for applications other than SQL Server. This reading indicates that you are using the server for purposes other than SQL Server, perhaps as a primary or backup domain controller, or as a print server, file server, Internet server, or mail server. To get the best performance, make this machine a dedicated SQL Server and put all the other services on another box.
If you are using replication, you should focus on the publishing machine. You should monitor the distribution machine and the subscriber as well, but the publisher will show the first signs of trouble. Turn on all counters in the SQL Server Replication-Publishing DB object. The three counters will tell you how many transactions are held in the log waiting to be replicated, how many milliseconds each transaction is taking to replicate, and how many transactions per second are being replicated.
User-Defined Counters
Last but not least, you can define counters. The user-defined counters are in the SQL Server User-Defined Counters object in the Master database. The 10 counters correspond to 10 new stored procedures called sp_User_Counter1 through sp_User_Counter10. These stored procedures are the only system stored procedures you should change. If you look at the code of these procedures, you will see that they all perform a Select 0, which, when tracked on Performance Monitor, draws a flat line at the bottom of the screen. Replace the Select 0 with a Select statement that returns one number; an integer is preferable, but float, real, and decimal numbers also work. These queries should be quick, not ones that take minutes to run.
Please note that these counters are different from the user counters mentioned earlier, which track the specific activity of a particular person logged in to SQL Server.
The current version of Performance Monitor contains a bug. If User Counter 1 contains an error, none of the 10 counters will show up in Performance Monitor. However, this bug is not the only reason that you might not see these user-defined counters in Performance Monitor. The Probe login account, added when you install SQL Server, must have both Select and Execute permission on these 10 stored procedures for them to appear.
It would be nice to be able to change the names of these stored procedures so you could more easily remember what you are tracking. Maybe this feature will be included in version 7.0.
Here is a trick: Suppose you want to count the number of transactions you have in a table. You could put the following statement in sp_User_Counter1:
SELECT COUNT(*) FROM MyDatabase.dbo.MyTable
If MyTable had 40 million rows, the stored procedure would take a lot of time to execute, even though it scans the smallest index to get an accurate count. Instead, you could get an approximate number by using the following command:
SELECT rows
FROM myDatabase.dbo.sysindexes
WHERE id=OBJECT_ID('MyTable')
AND indid IN (0,1)
This way is much faster, even though SQL Server does not keep the value in sysindexes up-to-date. Sometimes the counters tracked in sysindexes get out of sync with the actual table, and the only way to get them updated accurately is with DBCC. But most of the time the value in sysindexes is accurate enough.
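Putting that faster query into a user-defined counter might look like the sketch below; the database and table names are placeholders, the procedure lives in the Master database, and dropping and recreating the stub is one way to replace its body. Remember that the Probe account needs Select and Execute permission afterwards, as noted earlier.
-- Hypothetical replacement for the sp_User_Counter1 stub in the Master database
DROP PROCEDURE sp_User_Counter1
go
CREATE PROCEDURE sp_User_Counter1 AS
SELECT i.rows
FROM MyDatabase.dbo.sysindexes i, MyDatabase.dbo.sysobjects o
WHERE i.id = o.id
AND o.name = 'MyTable'
AND i.indid IN (0, 1)
go
GRANT EXECUTE ON sp_User_Counter1 TO probe
go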
LONG-TERM PERFORMANCE MONITORING
The concept behind a good long-term strategy for monitoring performance is simple to explain: Use log files to track as many items as you can without affecting performance. We break this discussion into three sections: establishing a baseline, monitoring performance over the long term, and tracking problems.
Establishing a Baseline
First, go to a command prompt and turn on the disk counters using the command Diskperf -y, then reboot. Then establish a new log file, click the + button, add all the options, and start the logging process. Choosing all the options tracks every instance of every counter in every object. You are tracking a lot of information, especially with the physical disk counters turned on.
Run Performance Monitor with this setup for a week; if you wish, you can manually stop and restart the log file every night so that each day is contained in a different log file. These measurements become your baseline; all your trend measurements will be based on this baseline. This method is not a perfect way to establish a baseline if you have very many special activities taking place on your server that week. But you may never experience a “typical” week, and it’s better to get some baseline measurement than wait.
We also recommend that you start a performance notebook. In this notebook, keep a page where you log special activities and events. For instance, an entry in your log might say, “Ran a special query for the big boss to show what a Cartesian product between two million-record tables does to the system.” In your performance notebook, be sure to record changes to the hardware, along with dates and times. You should also schedule actions like backups and transaction log dumps regularly so that when you look at system performance for one night last week, you do not have to wonder whether the backup was running.
We recommend that you run your long-term monitoring from another computer on the network. This way, you are not skewing the results by running it on the server you are trying to monitor. Also, avoid running Perfmon.exe to capture the long-term baseline, because someone must be logged on for it to run, and leaving an administrator machine logged on for long time periods is not a good idea. Instead, run the command-line version of Performance Monitor, called Monitor.exe. It is essentially the same program as Perfmon.exe without the screens. All output can be directed to the log files. To further simplify your life, get Srvany.exe from the Windows NT resource kit and make Monitor.exe into a Windows NT service. This way you can manage Monitor.exe like any other network service.
Periodically, perhaps once every six months, repeat this baseline process with all the counters turned on. Then compare your baselines to establish a trend.
Monitoring Performance over the Long Term
Once you have established your baseline, start another series of log files for your everyday use. First, turn off the physical disk counters with the Diskperf -n command from a command prompt and reboot the system. You can still track everything else if you want to because turning off the physical disk counters reduces the performance problems caused by monitoring. However, it is not necessary to track all the counters. We recommend you track the following objects:
Logical Disk
Memory
Paging File
Processor
Server
SQL Server
SQL Server — Replication (only if you are running replication)
SQL Server — Locks
SQL Server — Log
SQL Server — Procedure Cache
SQL Server — Users
System
Tracking Problems
When you experience performance problems, leave your Performance Monitor running with the log file so you continue to collect long-term data. Then start Performance Monitor again to track the particular problem. Turn on whatever counters you need to look at, using this chapter as a guide for the key counters to monitor in the disk, memory, network, and processors categories.
Start with the high-level counters — look for the words “total” or “percent” (or the % sign). When one of these counters indicates a problem, you usually have the option of watching counters that give you more detail. Learn which counters in different sections are related to each other. The relationships can tell you a lot. For example, the I/O Transactions Per Second counter in the SQL Server section is closely related to the CPU % counter in the processor section. If the number of I/O transactions per second goes up, so does the processor usage.
Concentrate on finding out which resource is causing the problem. Is it the system or a user process? Is it Windows NT or SQL Server? Before you purchase more hardware, try to find a configuration option related to the problem. Don’t hesitate to change hardware configuration or move data to different servers to balance the work among the available resources.
For specific examples of tuning performance, see Chapter 16, “Performance Tuning.”
Special Note: Use log files to track as many items as you can without affecting performance.
Monitoring with Transact-SQL
You can also use three Transact-SQL commands to do your own monitoring:
DBCC MEMUSAGE
DBCC SQLPERF — cumulative from the start of SQL Server; use the iostats, lrustats, and netstats parameters
DBCC PROCCACHE — six values used by Performance Monitor to monitor procedure cache
The output from these commands can be inserted into a table for long-term tracking and customized reporting. Tracking the MEMUSAGE output calls for some tricky programming because different sections have different output formats. The other two commands are more straightforward.
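For reference, the commands themselves are run as shown below; the SQLPERF parameters are the ones listed above:
DBCC MEMUSAGE
go
DBCC SQLPERF(IOSTATS)
go
DBCC SQLPERF(LRUSTATS)
go
DBCC SQLPERF(NETSTATS)
go
DBCC PROCCACHE
go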
The example below shows how to capture the DBCC PROCCACHE output. This command displays the same six values that you can display in Performance Monitor to watch the procedure cache usage in SQL Server.
CREATE TABLE PerfTracking
(date_added datetime default (getdate()),
num_proc_buffs int,
num_proc_buffs_used int,
num_proc_buffs_active int,
proc_cache_size int,
proc_cache_used int,
proc_cache_active int)
go
INSERT PerfTracking (num_proc_buffs, num_proc_buffs_used,
num_proc_buffs_active,
proc_cache_size, proc_cache_used, proc_cache_active)
EXEC ("dbcc proccache")
go
After running this command, you can use any SQL Server-compliant report writer or graphing program to create your own fancy graphs.
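As a simple example of the kind of long-term report you can build from this table, the query below (using the column names defined above) shows how full the procedure cache has been over time:
SELECT date_added,
    proc_cache_used * 100 / proc_cache_size AS pct_cache_used,
    num_proc_buffs_used * 100 / num_proc_buffs AS pct_buffers_used
FROM PerfTracking
ORDER BY date_added
go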
COUNTERS: A SUMMARY
The list below is a quick reference to the information about counters we’ve presented in this chapter. After the performance questions you may ask, we list related counters.
Is CPU the bottleneck?
system: % total processor time
system: processor queue length
What is SQL Server’s contribution to CPU usage?
SQL Server: CPU Time (all instances)
process: % Processor Time (SQL Server)
Is memory the bottleneck?
memory: page faults/sec (pages not in working set)
memory: pages/sec (physical page faults)
memory: cache faults/sec
What is SQL Server’s contribution to memory usage?
SQL Server: cache hit ratio
SQL Server: RA (all read ahead counters)
process: working set (SQL Server)
Is disk the bottleneck? (Remember that disk counters must be enabled for a true picture.)
physical disk: % disk time
physical disk: avg disk queue length
disk counters: monitor logical disk counters to see which disks are getting the most activity
What is SQL Server’s contribution to disk usage?
SQL Server-users: physical I/O (all instances)
SQL Server: I/O log writes/sec
SQL Server: I/O batch writes/sec
SQL Server: I/O single-page writes
Is the network the bottleneck?
server: bytes received/sec
server: bytes transmitted/sec
What is SQL Server’s contribution to network usage?
SQL Server: NET — Network reads/sec
SQL Server: NET — Network writes/sec
Did I make Tempdb the right size?
SQL Server: Max Tempdb space used (MB)
Is the procedure cache configured properly? (The highwater marks for the percentages are more important than the actual values.)
Max Procedure buffers active %
Max Procedure buffers used %
Max Procedure cache active %
Max Procedure cache used %
SUMMARY
SQL Server 6.5 gives you new configuration and tuning options. It also adds new counters to help you track the use of SQL Server on your system. Use Performance Monitor to see if your system is configured properly. Performance Monitor is one of the best tools you can use to identify current bottlenecks and prevent future problems.
Performance Monitor collects data about different counters, such as memory use. Performance Monitor can show you data in graphical format in real time, or you can save the data to log files. Pay particular attention to the discussion of log files in this section, because you will also use log files for long-term performance monitoring, which we will discuss in detail later. Working with a log file can be difficult to learn on your own because the options are not intuitive and are hidden on different screens.
You can choose between two Performance Monitors: one in the Administrative Tools group and one in the SQL Server group. They are the same basic program, but you need to run the one in the SQL Server group because it automatically loads the SQL Server-related counters. You run this version of the program with the following command: Perfmon.exe C:\Mssql\Binn\ Sqlctrs.pmc, where the .pmc file is the Performance Monitor counter file that contains the SQL counters. You can write applications that provide your own counters, and you can modify the new system stored procedures called sp_user_counter1 through sp_user_counter10 and track them, too.
When you run the program from the SQL Server group, the window in Figure 15.1 appears. Because you started the program that includes the set of SQL counters, five counters appear at the bottom of the window when Performance Monitor starts. The five counters are
Cache Hit Ratio
I/O — Transactions per second
I/O — Page Reads per second
I/O Single Page Writes per second
User Connections
These counters will be explained in more detail later, but first, let’s learn how to navigate in Performance Monitor.
Changing Menu Options
The first set of buttons on the tool bar at the top of the window corresponds to the four views of the monitor: chart, alert, log, and report views. You can get to the same options using the View menu.
The menu options change depending upon which view is currently active. Without going into too much detail about the View menu options, their basic purpose is to let you set up and save standard viewing templates for each of the four views.
Understanding Counters
Windows NT lets you watch the performance of the system by “counting” the activity associated with any of its objects. Examples of objects in Windows NT are processors, disk drives, and processes. Each object has specific counters associated with it; for example, the % User Time counter is associated with a CPU or processor to designate what percent of the CPU is taken up by user programs (as opposed to system processes). This chapter gives you enough information to help you choose the right counters at the right time.
SQL Server includes many predefined counters, most of which you aren’t likely to use except in special cases. It can be difficult to know which counters are the basic ones to watch. If you have chosen the SQL Server Performance Monitor, several counters have been set up as default counters, such as Cache Hit Ratio and User Connections. You can create your own defaults by creating a .pmc file.
The counters are hooks into the operating system and other programs, like SQL Server, that have been built into the software to let Performance Monitor get data. Data collection is performed efficiently so that the additional load on the system is minimized. Windows NT needs most of the information gathered for managing memory, processes, and threads, and Performance Monitor is a good program to display the results.
On the tool bar in Figure 15.1, the button next to the four view buttons at the top of the window is a big plus sign, which you use to add counters to monitor. Click the + button, and the window in Figure 15.2 will appear. The first field, Computer, has a search button at the end of the field. You can click on this field to bring up a list of all computers in your domain and choose a computer from the list, or you can type the name of a server you want to monitor. To monitor other servers, you need Windows NT administrative privileges on them.
In the next field, Object, you choose an object to monitor. The default is Processor, and the default counter shown in the field below is % Processor Time. The box on the right is the Instance. Any particular resource may have more than one instance ; that is, more than one of that particular resource — in this case, processors — may exist. This computer has only one processor (CPU) because the instance in the box is 0. Instance 3 refers to the fourth CPU.
From the fields along the bottom, you can pick the color, scale, line width, and line style of the information that will be displayed about the counter you are adding. These options let you choose a different look for each counter you add to the window. The only display choice that may need explanation is scale. The scale field is a multiplier that helps you fit the values on the screen in the range you have set on the y-axis, which by default is 0 –100.
After you choose the Object, Counter, and Instance you want to monitor and determine how you want the information to appear, click Add. The counter is added at the bottom of the list on the main window (Figure 15.1) and starts graphing the next time your data is refreshed.
If you click the Explain button in Figure 15.2, a brief explanation of the counter you specified will appear (Figure 15.3). Sometimes, though, it uses abbreviations and acronyms that require further research, unless you are a Windows NT internals guru.
Setting Up Alerts
An alert is the warning the computer sends you when a resource such as memory or the network becomes a bottleneck. When an alert occurs, it is written to a log file, along with the date and time it occurred. The log file is a circular file, allowing at most 1,000 entries before it starts overwriting the oldest alerts. The alert can also be written to the Windows NT event log.
To add a new alert, click the second button on the toolbar in Figure 15.1, then click the + button. The dialog box shown in Figure 15.4 will appear. Choose the counters you want to create alerts for, then click Add. The example on the screen will create an alert when the Cache Hit Ratio drops below 85 percent.
Notice the Run Program option in the lower right portion of the screen. You can use it to execute a program when the alert occurs. For example, you can choose SQL Server — Log in the Object field, Log Space Used (%) for the Counter, and the database you want to monitor from the Instance list. When the log file for that database gets above 90 percent, you can execute a batch file that runs an ISQL script to dump the transaction log. In this way you can reduce your chances of running out of log space.
Starting Log Files
Learning how to establish log files is very important, because log files are a critical part of the long-term strategy recommended later in this chapter. It can be a bit confusing, so let’s go through the steps.
Click the third button on the toolbar — View Output Log File Status. Notice that the Log File entry at the top is blank, the status is closed, the file size is zero, and the log interval is 15.00 (seconds).
Click +, and the list of objects shown in Figure 15.5 will appear. Select the ones you want to add to the log and click Add. If you hold down the Ctrl key while selecting, you can choose more than one counter, and holding down Shift lets you highlight all the items in a range. All counters in the objects you pick will be tracked in the log file. We will discuss what to monitor later.
Now we need to specify a log file. From the Options menu, choose Log. The dialog box shown in Figure 15.6 will appear.
This dialog box looks almost like the standard file dialog box, but it has two very important additions. At the bottom of the screen, the Update Time section shows the refresh interval. For short-term tracking, keep it at 15 seconds. For long-term tracking, set it at 300 seconds (5 minutes). The other important difference between this dialog box and the standard file name dialog box is the Start Log button. Nothing happens until you click this button to start collecting data. Once you do, the text of the button will change to Stop Log.
Type a log file name in the File Name box at the top. Then click Start Log.
Click OK to close this dialog box, then minimize the window and let the log run for a while.
Maximize the window and click the Stop Log button. Then switch to the Chart view by clicking the first button on the toolbar.
From the Options menu, choose Data From. Select the log file you named earlier. You can then choose the counters you want to view from the log.
The best part about using log files is that you can view a few counters at a time to avoid overcrowding the window. You can also mix and match the counters you want to analyze at the same time. This feature is important because many of the counters depend on other counters.
Special Note: The log file does not do anything until you click the Start Log button in the Log Options dialog box (also available by choosing Log in the Options menu).
Reports
The fourth button on the toolbar, the Reports button, lets you print customized reports of the data collected in your log file. Experiment with the available reports when you have a chance; we won’t cover this option here.
TIME INTERVALS
The default refresh interval for Performance Monitor is one second. Every second, you get new information about your system’s performance. This interval is good for a very short-term examination of the system, but it can be a drain on the server. A five-second interval causes much less overhead, probably in the neighborhood of five percent extra activity. However, for long-term monitoring, 5 seconds produces a very large log file.
Setting the interval to 5 minutes creates a reasonable size log file, but this large an interval can mask performance peaks. However, because each entry in the log file stores the minimum, average, and maximum values for each counter, or aspect of SQL Server you want to monitor, you can discover the peaks with a little extra analysis. Five minutes is a good setting for long-term logging. You can always fire up another copy of Performance Monitor and look at one- to five-second intervals if you want a short-term peek into the system’s performance.
To determine the amount of drain on the system from Performance Monitor, shut down all the services and start the Monitor again. Add the CPU usage and watch it for about 30 seconds at the default interval of one second. Then change the interval to 0.1 seconds. Your CPU usage will jump dramatically. One odd observation is that the effect of changing from one second to 0.1 seconds is different on different computers, and it is different between Windows NT 4.0 and Windows NT 3.51. For example, when changing the interval on two 133 MHz computers — a laptop and a tower box — the tower machine has the better performance at the shorter interval, showing about 55 percent utilization, while the laptop shows about 60 percent utilization.
Special Note: The faster your refresh option, the more the drain on the system. The default one-second refresh interval creates less than 5 percent overhead on a single-processor machine. For multiprocessor machines, the overhead is negligible. With the refresh interval set to 0.01 seconds, Performance Monitor takes about 60 percent of the resources. At 10 seconds per refresh, the drain is almost too small to measure, even with a lot of counters turned on.
WHAT TO MONITOR
Now that you know how to use the program, let’s get to the section you’ve been waiting for: How do you know what to monitor? Of the hundreds of Windows NT counters and 50 or so SQL counters, how do you choose? Should you monitor everything? How long should you monitor the system?
Monitoring performance helps you perform two related tasks: identifying bottlenecks and planning for your future hardware and software needs (capacity planning). Learning about the important counters will help identify potential bottlenecks. The strategy section later in this chapter will help you put together a good plan for creating a general monitoring strategy.
What do you want to monitor? Everything! Well, monitoring everything may be a good idea for a short period, but the results will show that many of the counters are always at or near zero; monitoring them all the time may be a waste of time and resources. You need to establish a baseline for your system. This baseline lets you know what results are normal and what results indicate a problem. Once you establish a baseline, you don’t need to track everything.
The key categories to monitor can be split into two major sections: Windows NT categories and SQL Server categories. Categories in this sense are groups of objects that contain counters.
Windows NT
Memory
Processor
Disk I/O
Network
SQL Server
Cache
Disk I/O
Log
Locks
Users
Other Predefined Counters
User-Defined Counters
When monitoring both categories of data, look for trends of high and low activity. For example, particular times during the day, certain days of the week, or certain weeks of the month might show more activity than others. After you identify highs and lows, try to redistribute the workload. These peaks and valleys are especially good to know when something new is added to the schedule. If the peak loads are causing problems, identify which things can be scheduled at a later time when the system is not so busy. Knowing the load patterns is also helpful when problems occur, so that you can re-run a particular job or report when the load is low.
Get to know your users — find out which reports they need first thing in the morning. Perhaps you can schedule these reports to run at night in batch mode, instead of having users start them during a busy time.
Monitoring Windows NT
The purpose of monitoring the Windows NT categories is to answer one of two questions: “What resource is my bottleneck?” or “Do I see any upward usage trends that tell me what resource I might run low on first?” SQL Server 6.5 introduced several highwater markers, such as Max Tempdb space used, which make it easier to identify potential long-term problems.
Memory
The Memory: Pages/sec counter is the number of pages read or written to the disk when the system can’t find the page in memory. This page management process is referred to as paging. If the average value for this counter is five, you need to tune the system. If this value is 10 or more, put tuning the server high on your priority list. Before SQL Server 6.0, the value for this counter was an important flag to tell you whether memory was the bottleneck. Now, with SQL Server’s parallel read-ahead feature, this counter will give you only an indication of how busy the read-ahead manager is. However, we will discuss other counters that are better at tracking the read-ahead manager. In other words, this counter may have been one of the most significant counters to track in the past, and it still is on machines without SQL Server, but better ones are available to track memory.
The Memory: Available Bytes counter displays the amount of free physical memory. If the value for this counter is consistently less than 10 percent of your total memory, paging is probably occurring. You have too much memory allocated to SQL Server and not enough to Windows NT.
Processor
Before we start talking about the counters in the processor category, it is important to know that Windows NT assigns certain responsibilities to certain processors if you have four or more CPUs. Processor 0 is the default CPU for the I/O subsystem. Network Interface Cards (NIC) are assigned to the remaining CPUs, starting from the highest-numbered CPU. If you have four processors and one NIC, that card is assigned Processor 3. The next NIC gets Processor 2. Windows NT does a good job of spreading out processor use. You can also set which processors SQL Server uses. See Chapter 16, “Performance Tuning,” particularly the notes on the Affinity Mask, for more information about allocating processors.
You can monitor each processor individually or all the processors together. For monitoring individual processors, use the Processor: % Processor Time counter, selecting one instance per CPU. This counter lets you see which processors are the busiest.
A better counter to monitor over the long term is the System: % Total Processor Time counter, which groups all the processors to tell you the average percentage of time that all processors were busy executing non-idle threads.
Who (or what) is consuming the CPU time? Is it the users, system interrupts, or other system processes? The Processor: Interrupts/sec counter will tell you if it is the system interrupts. A value of more than 1,000 indicates that you should get better network cards, disk controllers, or both. If the Processor: % Privileged Time is greater than 20 percent (of the total processor time) and Processor: % User Time is consistently less than 80 percent, then SQL Server is probably generating excessive I/O requests to the system. If your machine is not a dedicated SQL Server machine, make it so. If none of these situations is occurring, user processes are consuming the CPU. We will look at how to monitor user processes when we consider SQL Server-specific counters in the next section.
Disk I/O
As discussed in Chapter 16, “Performance Tuning,” having many smaller drives is better than having one large drive for SQL Server machines. Let’s say that you need 4 GB of disk space to support your application with SQL Server. Buy four 1-GB drives instead of one 4-GB drive. Even though the seek time is faster on the larger drive, you will still get a tremendous performance improvement by spreading files, tables, and logs among more than one drive.
Special Note: The single best performance increase on a SQL Server box comes from spreading I/O among multiple drives (adding memory is a close second).
Monitor the disk counters to see whether the I/O subsystem is the bottleneck, and if it is, to determine which disk is the culprit. The problem may be the disk controller board. The first thing to know about monitoring disk I/O is that to get accurate readings from the Physical Disk counters, you must go to a command prompt window and type DISKPERF -y, then reboot. This procedure turns on the operating system hooks into the disk subsystem. However, this setup also causes a small performance decrease of 3 to 5 percent, so you want to turn this on only periodically and only for a short period. Use the Diskperf -n command to turn it off, then restart your system.
Track Physical Disk: % Disk Time to see how much time each disk is busy servicing I/O, including time spent waiting in the disk driver queue. If this counter is near 100 percent on a consistent basis, then the physical disk is the bottleneck. Do you rush out and buy another disk? Perhaps that is the best strategy if the other drives are also busy, but you have other options. You may get more benefit from buying another controller and splitting the I/O load between the different controllers. Find out what files or SQL Server tables reside on that disk, and move the busy ones to another drive. If the bottleneck is the system drive, split the virtual memory swap file to another drive, or move the whole file to a less busy drive. You should already have split the swap file, unless you only have one drive (which is very silly on a SQL Server machine).
LogicalDisk: Disk Queue Length and PhysicalDisk: Disk Queue Length can reveal whether particular drives are too busy. These counters track how many requests are waiting in line for the disk to become available. Values of less than 2 are good; sustained values of 2 or more mean requests are backing up behind that disk.
Network
Redirector: Read Bytes Network/Sec gives the actual rate at which bytes are being read from the network. Dividing this value by the value for the Redirector: Bytes Received/Sec counter gives the efficiency with which the bytes are being processed.
If this ratio is 1:1, your system is processing network packets as fast as it gets them. If this ratio is below 0.8, then the network packets are coming in faster than your system can process them. To correct this problem on a multiprocessor system, use the Affinity Mask and SMP Concurrency options in the SQL Configuration dialog box to allocate the last processor to the network card, and don’t let SQL Server use that processor. For example, if you have four CPUs, set the Affinity Mask to 7 (binary 0111) and SMP Concurrency to 3. This setup gives three CPUs to SQL Server and the fourth processor to the network card, which Windows NT assigns to that processor by default. If I/O is also a problem, set the Affinity Mask to 6 (binary 0110) and SMP Concurrency to 2, because Windows NT assigns the I/O subsystem to the first processor by default.
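As a rough sketch of how these two options can be set from Transact-SQL (both are advanced options in this release, and the affinity mask normally takes effect only after SQL Server is restarted; verify the exact option names with sp_configure on your server):

-- Make advanced options visible (the option name may differ slightly by release)
EXEC sp_configure 'show advanced option', 1
RECONFIGURE WITH OVERRIDE
go
-- Binary 0110 = 6: SQL Server runs on CPUs 1 and 2 only,
-- leaving CPU 0 for the I/O subsystem and CPU 3 for the NIC
EXEC sp_configure 'affinity mask', 6
EXEC sp_configure 'SMP concurrency', 2
RECONFIGURE WITH OVERRIDE
go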
Monitoring SQL Server
The questions to ask yourself when monitoring the SQL Server categories are “Do I have the optimal configuration values for SQL Server?” and “Who is consistently using the most resources?”
If any of the counters considered in this section indicate a problem, the problem is somewhere related to SQL Server. If the problem is I/O, memory, CPU, or locks, you can dig deeper and find out who the culprits are. However, if you are using a long-term logging strategy for monitoring, you must monitor every session to be sure you have the necessary historical data when you want to see what was happening at a particular time.
If you are watching the monitor when a problem occurs, go to the SQL Server-Users object and turn on the counter for all instances. The instances in this case are the sessions currently logged on. You can see the login ID and the session number. If you see one or more sessions causing the problem, you can spy on them to find the last command sent. Go to the Enterprise Manager, click the Current Activity button on the toolbar, and double-click the line in the display corresponding to the session number. You will see the last command received from the session. To trace commands in more depth, use the SQLTrace utility that is new with version 6.5. (See Chapter 3, “Administrative and Programming Tools,” for details.)
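If you prefer to stay in a query window instead of the Enterprise Manager, a minimal sketch along the same lines, assuming DBCC INPUTBUFFER is available in your build, is:

EXEC sp_who             -- list current sessions: spid, login, and status
go
DBCC INPUTBUFFER(12)    -- show the last statement sent by a session; replace 12 with the suspect spid
go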
The five main categories of SQL Server counters to monitor are cache, disk I/O, log, locks, and users. We will consider each of these categories separately as well as a mix of other important predefined counters. The final part of this section discusses the new user-defined counters.
Cache
To monitor your cache, watch SQL Server — Cache Hit Ratio. It monitors the rate at which the system finds pages in memory without having to go to disk. The cache hit ratio is the number of logical reads divided by the total of logical plus physical reads. If the value for this counter is consistently less than 80 percent, you should allocate more memory to SQL Server, buy more system memory, or both. However, before you buy more memory, you can try changing the read-ahead configuration options. Also look at the discussion of free buffers in the next chapter to determine whether the number of free buffers is approaching zero. Changing the free buffers configuration parameter may increase the cache hit ratio.
To find out if you have configured SQL Server properly, you should monitor SQL Server-Procedure Cache: Max Procedure Cache Used (%). If this counter approaches or exceeds 90 percent during normal usage, increase the procedure cache in the SQL Server configuration options. If the maximum cache used is less than 50 percent, you can decrease the configuration value and give more memory to the data cache. Rumor has it that SQL Server 7.0 will have a floating-point number for the procedure cache configuration parameter so that you can give the procedure cache less than 1 percent of your SQL Server memory. For a super server with gigabytes of memory, even 1 percent is too much for procedure cache.
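A hedged sketch of shrinking the procedure cache with sp_configure once the high-water mark shows it is oversized (the value is a percentage of SQL Server memory and, in this release, generally requires a restart to take effect; treat the example value as an assumption):

EXEC sp_configure 'procedure cache'        -- check the current percentage first
go
EXEC sp_configure 'procedure cache', 15    -- example value only; size it from Max Procedure Cache Used (%)
RECONFIGURE
go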
If a 2K data page has been swapped to the Windows NT virtual memory file and read in again later, SQL Server still counts the page as already in memory for the purposes of the Cache Hit Ratio counter. Therefore, a system bogged down by heavy swapping to virtual memory could still show a good cache hit ratio. To find out if your system is in this category, monitor the Memory: Page Faults/Sec counter.
The Memory: Page Faults/Sec counter watches the number of times a page was fetched from virtual memory, meaning that the page had been swapped to the Windows NT swap file. It also adds to the counter the number of pages shared by other processes. This value can be high while system services, including SQL Server, are starting up. If it is consistently high, you may have given too much memory to SQL Server. The network and operating system may not have enough memory to operate efficiently.
Warning: This counter is a strange one to figure out. Running this counter on four different types of machines gave widely different results. To try to get a baseline value, we turned off all services, including SQL Server, unplugged the boxes from the network, and ran Performance Monitor with only the Memory: Page Faults/Sec counter turned on. The lowest measurement of page faults per second was from the system we least expected — a 50 MHz 486 with 16 MB of memory and one disk drive. It settled in at about five to seven page faults per second. The DEC Alpha with 4 processors, 10 GB RAID 5 striping on 5 drives, and 256 MB of memory was up in the 35 to 40 page faults per second range. So was a similarly configured Compaq ProLiant. The laptop performed in the middle, at about 15 page faults per second. It is a 90 MHz Pentium with 1 disk drive and 40 MB of memory. All were running Microsoft Windows NT version 3.51 service pack 4. All services except Server and Workstation were turned off. Running the same experiment with Windows NT 4.0 service pack 1 showed approximately the same results, except that the page faults per second numbers ran consistently 10 percent less than in Windows NT 3.51.
The result of this experiment is that we can’t recommend a range to gauge the performance of your machine. The best you can do is turn off all services for a brief period to get a baseline measurement on your machine, then use this value as a guide for your regular usage.
Disk I/O
Several counters measure how busy your disk drives are and which disk drives are the busiest. Remember that for any I/O measurements to be effective, you must run the Windows NT Diskperf -y command and reboot the system.
Even though the SQL Server: I/O Transactions Per Second counter is a bit misleading, it is still good, especially for capacity planning. This counter measures the number of Transact-SQL batches processed since the last refresh period. You should not use these results against any standard TPC benchmark tests that give results in transactions per second — it is not referring to a Begin/Commit transaction, just to batches of commands. Watch this number over a span of several months, because an increase in this counter can indicate that the use of SQL Server is growing.
The SQL Server: I/O — Lazy Writes/Sec counter monitors the number of pages per second that the lazy writer is flushing to disk. The lazy writer is the background process that takes data from cache memory and writes it to disk; some disk controllers provide a similar hardware write-back cache on the drive itself. A sustained high rate of lazy writes per second could indicate any of three possible problems:
the Recovery Interval configuration parameter is too short, causing many checkpoints
too little memory is available for page caching
the Free Buffers parameter is set too low
Normally this rate is zero until the least-recently used (LRU) threshold is reached. LRU is the indicator by which memory is released for use by other processes. Buying more memory may be the best solution if the configuration parameters seem to be in line for your server size.
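Before buying memory, it is worth displaying the two configuration suspects named above; a minimal sketch (if an option is not listed, advanced options are probably hidden on your server):

EXEC sp_configure 'recovery interval'   -- too short a value forces frequent checkpoints
go
EXEC sp_configure 'free buffers'        -- too low a value keeps the lazy writer busy
go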
The SQL Server: I/O Outstanding Reads counter and the I/O Outstanding Writes counter measure the number of physical reads and writes pending. These counters are similar to the PhysicalDisk: Disk Queue Length counter. A high value for this counter for a sustained period may point to the disk drives as a bottleneck. Adding memory to the data cache and tuning the read-ahead parameters can decrease the physical reads.
The SQL Server: I/O Page Reads per Second counter is the number of pages not found in the SQL Server data cache, which indicates physical reads of data pages from disk. This value does not count pages that are read from the Windows NT virtual memory disk file. There is no way to watch only the logical page reads per second. According to sources in the SQL development team, counters for logical page reads are hidden in a structure that is not available in this version of SQL Server. However, you can figure out the logical page reads per second by taking the total page reads per second and subtracting the physical page reads per second.
You should occasionally turn on the I/O Single Page Writes counter. A lot of single-page writes means you need to tune SQL Server, because it is writing single pages to disk instead of its normal block of pages. Most writes consist of an entire extent (eight pages) and are performed at a checkpoint. The lazy writer handles the writing of an extent at a time. When SQL Server is forced to hunt for free pages, it starts finding and writing the LRU pages to disk, one page at a time. A high number of single-page writes means that SQL Server does not have enough memory to keep a normal number of pages in data cache. Your choices are to give more memory to SQL Server by taking memory away from the static buffers, by decreasing the procedure cache, or by decreasing the amount of memory allocated to Windows NT.
Log
Tie the SQL Server — Log: Log space used (%) counter to an alert. When the value goes over 80 percent, send a message to the administrator and to the Windows NT event log. When it goes over 90 percent, dump the transaction log to a disk file (not the diskdump device), which will back up the log and truncate it. You want to track this counter for all your application databases, for Tempdb, and for the Distribution database if you are running replication.
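The dump itself can be issued by the task the alert fires. A sketch of the command, with the database name and file path as placeholders you would substitute:

-- Back up the transaction log to a disk file and truncate it
DUMP TRANSACTION MyAppDB TO DISK = 'C:\Mssql\Backup\MyAppDB_log.dat'
go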
Locks
To check out locking, turn on the SQL Server Locks: Total Locks and Total Blocking Locks counters. If you notice a period of heavy locking, turn on some of the other lock counters to get a better breakdown of the problem. The value for Total Blocking Locks should be zero or close to it as often as possible.
One counter to turn on to see if you have configured the system correctly is SQL Server Licensing: Max Client Count. Once you have established that your licensing choice is correct, turn it off. You should turn it back on occasionally to check the connections. If you do exceed the license count, you will know because users will be denied access.
Users
When you suspect that one particular user is the cause of any performance problems, turn on the counters in the Users section. However, with many users on the system, it is difficult to guess which counters to use, and it is difficult to turn on all counters for all sessions. One shortcut is to go into the Current Activity screen of the SQL Enterprise Manager and look at the locks in the Locks tab as well as the changes in CPU and Disk I/O activity in the Detail tab.
Monitor the SQL Server — Users: CPU Time counter for each user. Users for whom this counter returns high values may use inefficient queries. If the query appears reasonable, a high value may indicate an indexing problem or poor database design. Use Showplan to determine if the database’s indexes are optimal. Look for wide tables (long row sizes), which indicate a non-normalized database. Wide tables and inefficient indexes can cause more I/O than table scans.
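One way to check such a query, as a hedged sketch for this release (SET SHOWPLAN ON displays the plan, and SET NOEXEC ON keeps the query from actually running while you read it):

SET SHOWPLAN ON
SET NOEXEC ON
go
SELECT * FROM MyDatabase.dbo.MyTable WHERE SomeColumn = 42   -- the suspect query (placeholder)
go
SET NOEXEC OFF
SET SHOWPLAN OFF
go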
Other Predefined Counters
A new counter in SQL Server 6.5, SQL Server: Max Tempdb Space Used, indicates how well you have estimated the size of Tempdb. If the value for this counter is very small, you know you have overestimated the size of Tempdb. Be sure to watch this counter frequently, especially during the busiest times and when your nightly jobs run. If it approaches the size of Tempdb, then you should probably increase Tempdb’s size.
Compare SQL Server: NET — Network Reads/Sec to SQL Server: NET — Bytes Received/Sec (or Network Writes/Sec compared to Bytes Transmitted/Sec). If the SQL Server network counters are significantly lower than your server counter, your server is busy processing network packets for applications other than SQL Server. This reading indicates that you are using the server for uses other than SQL Server, perhaps as a primary or backup domain controller, or as a print server, file server, Internet server, or mail server. To get the best performance, make this server a dedicated SQL server and put all the other services on another box.
If you are using replication, you should focus on the publishing machine. You should monitor the distribution machine and the subscriber as well, but the publisher will show the first signs of trouble. Turn on all counters in the SQL Server Replication-Publishing DB object. The three counters will tell you how many transactions are held in the log waiting to be replicated, how many milliseconds each transaction is taking to replicate, and how many transactions per second are being replicated.
User-Defined Counters
Last but not least, you can define counters. The user-defined counters are in the SQL Server User-Defined Counters object in the Master database. The 10 counters correspond to 10 new stored procedures called sp_User_Counter1 through sp_User_Counter10. These stored procedures are the only system stored procedures you should change. If you look at the code of the procedures, each one performs a Select 0, which, when tracked in Performance Monitor, draws a flat line at the bottom of the screen. Replace the Select 0 with a Select statement that returns one number; an integer is preferable, but float, real, and decimal numbers also work. These queries should be quick, not ones that take minutes to run.
Please note that these counters are different from the user counters mentioned earlier, which track the specific activity of a particular person logged in to SQL Server.
The current version of Performance Monitor contains a bug. If User Counter 1 contains an error, none of the 10 counters will show up in Performance Monitor. However, this bug is not the only reason that you might not see these user-defined counters in Performance Monitor. The Probe login account, added when you install SQL Server, must have both Select and Execute permission on these 10 stored procedures for them to appear.
It would be nice to be able to change the names of these stored procedures so you could more easily remember what you are tracking. Maybe this feature will be included in version 7.0.
Here is a trick: Suppose you want to count the number of transactions you have in a table. You could put the following statement in sp_User_Counter1:
SELECT COUNT(*) FROM MyDatabase.dbo.MyTable
If MyTable had 40 million rows, the stored procedure would take a lot of time to execute, even though it scans the smallest index to get an accurate count. Instead, you could get an approximate number by using the following command:
SELECT rows
FROM MyDatabase.dbo.sysindexes
WHERE id = OBJECT_ID('MyTable')
AND indid IN (0,1)
This way is much faster, even though SQL Server does not keep the value in sysindexes up-to-date. Sometimes the counters tracked in sysindexes get out of sync with the actual table, and the only way to get them updated accurately is with DBCC. But most of the time the value in sysindexes is accurate enough.
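To put the sysindexes query into a user-defined counter, drop and re-create the stored procedure in the Master database (a sketch; there is no ALTER PROCEDURE in this release, and the grant matches the permission note above):

USE master
go
DROP PROCEDURE sp_User_Counter1
go
CREATE PROCEDURE sp_User_Counter1 AS
    -- Approximate row count for MyTable, read from sysindexes
    SELECT rows
    FROM MyDatabase.dbo.sysindexes
    WHERE id = OBJECT_ID('MyTable')
    AND indid IN (0,1)
go
GRANT EXECUTE ON sp_User_Counter1 TO probe
go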
LONG-TERM PERFORMANCE MONITORING
The concept behind a good long-term strategy for monitoring performance is simple to explain: Use log files to track as many items as you can without affecting performance. We break this discussion into three sections: establishing a baseline, monitoring performance over the long term, and tracking problems.
Establishing a Baseline
First, go to a command prompt and turn on the disk counters using the command Diskperf -y, then reboot. Then establish a new log file, click the + button, add all the options, and start the logging process. Choosing all the options tracks every instance of every counter in every object. You are tracking a lot of information, especially with the physical disk counters turned on.
Run Performance Monitor with this setup for a week; if you wish, you can manually stop and restart the log file every night so that each day is contained in a different log file. These measurements become your baseline; all your trend measurements will be based on this baseline. This method is not a perfect way to establish a baseline if you have very many special activities taking place on your server that week. But you may never experience a “typical” week, and it’s better to get some baseline measurement than wait.
We also recommend that you start a performance notebook. In this notebook, keep a page where you log special activities and events. For instance, an entry in your log might say, “Ran a special query for the big boss to show what a Cartesian product between two million-record tables does to the system.” In your performance notebook, be sure to record changes to the hardware, along with dates and times. You should also schedule actions like backups and transaction log dumps regularly so that when you look at system performance for one night last week, you do not have to wonder whether the backup was running.
We recommend that you run your long-term monitoring from another computer on the network. This way, you are not skewing the results by running it on the server you are trying to monitor. Also, avoid running Perfmon.exe to capture the long-term baseline, because someone must be logged on for it to run, and leaving an administrator machine logged on for long time periods is not a good idea. Instead, run the command-line version of Performance Monitor, called Monitor.exe. It is essentially the same program as Perfmon.exe without the screens. All output can be directed to the log files. To further simplify your life, get Srvany.exe from the Windows NT resource kit and make Monitor.exe into a Windows NT service. This way you can manage Monitor.exe like any other network service.
Periodically, perhaps once every six months, repeat this baseline process with all the counters turned on. Then compare your baselines to establish a trend.
Monitoring Performance over the Long Term
Once you have established your baseline, start another series of log files for your everyday use. First, turn off the physical disk counters with the Diskperf -n command from a command prompt and reboot the system. You can still track everything else if you want to because turning off the physical disk counters reduces the performance problems caused by monitoring. However, it is not necessary to track all the counters. We recommend you track the following objects:
Logical Disk
Memory
Paging File
Processor
Server
SQL Server
SQL Server — Replication (only if you are running replication)
SQL Server — Locks
SQL Server — Log
SQL Server — Procedure Cache
SQL Server — Users
System
Tracking Problems
When you experience performance problems, leave your Performance Monitor running with the log file so you continue to collect long-term data. Then start Performance Monitor again to track the particular problem. Turn on whatever counters you need to look at, using this chapter as a guide for the key counters to monitor in the disk, memory, network, and processors categories.
Start with the high-level counters — look for the words “total” or “percent” (or the % sign). When one of these counters indicates a problem, you usually have the option of watching counters that give you more detail. Learn which counters in different sections are related to each other. The relationships can tell you a lot. For example, the I/O Transactions Per Second counter in the SQL Server section is closely related to the CPU % counter in the processor section. If the number of I/O transactions per second goes up, so does the processor usage.
Concentrate on finding out which resource is causing the problem. Is it the system or a user process? Is it Windows NT or SQL Server? Before you purchase more hardware, try to find a configuration option related to the problem. Don’t hesitate to change hardware configuration or move data to different servers to balance the work among the available resources.
For specific examples of tuning performance, see Chapter 16, “Performance Tuning.”
Special Note: Use log files to track as many items as you can without affecting performance.
Monitoring with Transact-SQL
You can also use three Transact-SQL commands to do your own monitoring:
DBCC MEMUSAGE
DBCC SQLPERF — cumulative from the start of SQL Server; use the IOSTATS, LRUSTATS, and NETSTATS parameters
DBCC PROCCACHE — six values used by Performance Monitor to monitor procedure cache
The output from these commands can be inserted into a table for long-term tracking and customized reporting. Tracking the MEMUSAGE output calls for some tricky programming because different sections have different output formats. The other two commands are more straightforward.
The example below shows how to capture the DBCC PROCCACHE output. This command displays the same six values that you can display in Performance Monitor to watch the procedure cache usage in SQL Server.
-- Table to hold the six values returned by DBCC PROCCACHE;
-- date_added records when each sample was taken
CREATE TABLE PerfTracking
(date_added datetime default (getdate()),
num_proc_buffs int,
num_proc_buffs_used int,
num_proc_buffs_active int,
proc_cache_size int,
proc_cache_used int,
proc_cache_active int)
go
-- Capture one sample: INSERT ... EXEC stores the output of DBCC PROCCACHE
INSERT PerfTracking (num_proc_buffs, num_proc_buffs_used,
num_proc_buffs_active,
proc_cache_size, proc_cache_used, proc_cache_active)
EXEC ('dbcc proccache')
go
After running this command, you can use any SQL Server-compliant report writer or graphing program to create your own fancy graphs.
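For example, once the INSERT above has been run at regular intervals (say, from a scheduled task), a simple summary query such as the following sketch shows the daily high-water marks of procedure cache use:

-- Daily high-water marks from the samples collected in PerfTracking
SELECT CONVERT(char(8), date_added, 112) AS day_added,   -- yyyymmdd
       MAX(proc_cache_used)   AS max_cache_used,
       MAX(proc_cache_active) AS max_cache_active
FROM PerfTracking
GROUP BY CONVERT(char(8), date_added, 112)
ORDER BY 1
go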
COUNTERS: A SUMMARY
The list below is a quick reference to the information about counters we’ve presented in this chapter. After the performance questions you may ask, we list related counters.
Is CPU the bottleneck?
system: % total processor time
system: processor queue length
What is SQL Server’s contribution to CPU usage?
SQL Server: CPU Time (all instances)
process: % Processor Time (SQL Server)
Is memory the bottleneck?
memory: page faults/sec (pages not in working set)
memory: pages/sec (physical page faults)
memory: cache faults/sec
What is SQL Server’s contribution to memory usage?
SQL Server: cache hit ratio
SQL Server: RA (all read ahead counters)
process: working set (SQL Server)
Is disk the bottleneck? (Remember that disk counters must be enabled for a true picture.)
physical disk: % disk time
physical disk: avg disk queue length
disk counters: monitor logical disk counters to see which disks are getting the most activity
What is SQL Server’s contribution to disk usage?
SQL Server-users: physical I/O (all instances)
SQL Server: I/O log writes/sec
SQL Server: I/O batch writes/sec
SQL Server: I/O single-page writes
Is the network the bottleneck?
server: bytes received/sec
server: bytes transmitted/sec
What is SQL Server’s contribution to network usage?
SQL Server: NET — Network reads/sec
SQL Server: NET — Network writes/sec
Did I make Tempdb the right size?
SQL Server: Max Tempdb space used (MB)
Is the procedure cache configured properly? (The highwater marks for the percentages are more important than the actual values.)
Max Procedure buffers active %
Max Procedure buffers used %
Max Procedure cache active %
Max Procedure cache used %
SUMMARY
SQL Server 6.5 gives you new configuration and tuning options. It also adds new counters to help you track the use of SQL Server on your system. Use Performance Monitor to see if your system is configured properly. Performance Monitor is one of the best tools you can use to identify current bottlenecks and prevent future problems.
Database Design by Yulius Eka Agung Seputra, ST, MSi
A database is a collection of interrelated data, stored on the computer's external storage and manipulated by specific software. The database is one of the most important components of an information system, because it serves as the base that supplies information to its users. Applying a database within an information system is called a database system: an information system that integrates a collection of interrelated data and makes it available to a variety of applications within an organization. The goal of database design is to determine the data the system needs, so that the required information can be produced properly. There are several reasons why database design is necessary; one of them is to avoid repeated (redundant) data.
The methods for minimizing data redundancy include:
a. Normalization.
b. Lossless decomposition.
Decomposition is needed when there are signs that a table is poorly designed (repeated information, potential data inconsistency during updates, hidden information), and it must be lossless so that joining the decomposed tables back together reproduces the original table, leaving us with well-formed tables.
c. ERD (Entity Relationship Diagram).
d. Determining the cardinality of relationships.
Several definitions of keys are relevant to normalization and ERDs:
a. A superkey is a set of entity attributes that can be used to identify an object uniquely.
b. A candidate key is a superkey with a minimal number of attributes that can stand on its own.
c. A primary key is the superkey chosen by the designer or database administrator.
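As a minimal T-SQL sketch of how these kinds of keys appear in a table definition (the PEKERJA attributes come from the E-R example later in this chapter; the column types, and the assumption that name plus date of birth form a second candidate key, are illustrative and not from the text):

-- NIP is the minimal candidate key chosen as the primary key.
-- (Nama, TanggalLahir) is assumed here to be another candidate key,
-- so it is declared UNIQUE; adding NIP to it would only give a superkey.
CREATE TABLE Pekerja (
    NIP          char(9)     NOT NULL PRIMARY KEY,
    Nama         varchar(50) NOT NULL,
    TanggalLahir datetime    NOT NULL,
    Golongan     varchar(10) NULL,
    CONSTRAINT UQ_Pekerja_Nama_Tgl UNIQUE (Nama, TanggalLahir)
)
go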
Normalization.
Normalization is a process, tied to the relational data model, for organizing sets of data that have strong dependencies and close relationships. The result of normalization is a collection of data sets in normal form. There are several normal forms:
a. First Normal Form (1-NF).
b. Second Normal Form (2-NF).
c. Third Normal Form (3-NF).
d. Fourth Normal Form (4-NF).
e. Boyce-Codd Normal Form (BCNF).
f. Project-Join Normal Form (PJNF).
g. Domain-Key Normal Form (DKNF).
h. Fifth Normal Form (5-NF).
The benefits of normalization:
a. Minimizing repeated information.
b. Making entities/objects easier to identify.
First Normal Form (1-NF).
A relation satisfies 1-NF if and only if every attribute of the relation holds only a single value in any one row or record.
Table 4.1: Unnormalized form (non-1-NF table)
Table 4.2: 1-NF table
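As an illustration, here is a hedged sketch using the student/course attributes that appear later in this section. In the unnormalized form, one row would carry several courses for a single student; after 1-NF, every attribute holds exactly one value per row:

-- 1-NF: one course registration per row, single-valued attributes only.
-- Column types are assumptions for illustration.
CREATE TABLE StudentCourse (
    StudentID char(8)     NOT NULL,
    Student   varchar(40) NOT NULL,
    BirthDate datetime    NOT NULL,
    CourseID  char(6)     NOT NULL,
    Course    varchar(40) NOT NULL,
    Credit    int         NOT NULL,
    Grade     char(1)     NOT NULL,
    Weight    int         NOT NULL
)
go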
Second Normal Form (2-NF).
A relation satisfies 2-NF if and only if:
a. It satisfies 1-NF.
b. Every attribute that is not part of the primary key is functionally dependent on the whole set of key attributes, not on just part of the key.
If a relation satisfies 1-NF and its primary key consists of exactly one attribute, the relation automatically satisfies 2-NF.
The rationale for 2-NF:
a. It has more explicit semantics than 1-NF.
b. It prevents several kinds of update anomalies.
Table 4.3: 2-NF table (satisfying 1-NF)
The functional dependencies are:
a. StudentID => Student, BirthDate (SC1).
b. CourseID => Course, Credit (SC2).
c. StudentID, CourseID => Grade (SC3, SC3A).
d. Grade => Weight (SC3B).
Table 4.4: Tables satisfying 2-NF
Table 4.5: Tables satisfying 3-NF
In the end, the tables SC1, SC2, SC3A, and SC3B are all in 3-NF, so the whole database is in 3-NF.
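A T-SQL sketch of the decomposed tables implied by the functional dependencies above (the table names SC1, SC2, SC3A, and SC3B follow the text; the column types are assumptions):

-- Each table holds exactly one of the functional dependencies listed above.
CREATE TABLE SC1 (
    StudentID char(8)     NOT NULL PRIMARY KEY,  -- StudentID => Student, BirthDate
    Student   varchar(40) NOT NULL,
    BirthDate datetime    NOT NULL
)
go
CREATE TABLE SC2 (
    CourseID char(6)     NOT NULL PRIMARY KEY,   -- CourseID => Course, Credit
    Course   varchar(40) NOT NULL,
    Credit   int         NOT NULL
)
go
CREATE TABLE SC3A (
    StudentID char(8) NOT NULL,
    CourseID  char(6) NOT NULL,
    Grade     char(1) NOT NULL,                  -- StudentID, CourseID => Grade
    PRIMARY KEY (StudentID, CourseID)
)
go
CREATE TABLE SC3B (
    Grade  char(1) NOT NULL PRIMARY KEY,         -- Grade => Weight
    Weight int     NOT NULL
)
go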
Third Normal Form (3-NF).
A relation satisfies third normal form (3-NF) if and only if:
a. The relation satisfies 2-NF.
b. No non-key attribute is functionally dependent on another non-key attribute of the relation.
A relation that satisfies 2-NF and has only one non-key attribute always satisfies 3-NF.
Boyce-Codd Normal Form (BCNF).
A relation satisfies BCNF if and only if every determinant in the relation is a candidate key. A determinant is a set of attributes on which one or more other attributes are functionally dependent.
The Entity-Relationship (E-R) Model.
The entity-relationship model is based on a perception of the real world as a collection of basic objects, called entities, and the relationships between them. An entity is an object that can be identified uniquely; it is characterized and represented by a set of attributes. For example, the attributes of the entity PEKERJA (employee) are name, date of birth, NIP (employee number), and grade/rank. A group of entities sharing the same characterization is called an entity set, and each entity in the group is a member of the set. Examples of entity sets are the set of bank employees and the set of bank customers. Relationships can exist between such sets, for example between the set of banks and the set of bank customers. Based on the number of sets involved, relationships between entities are classified as:
a. Binary relationships, between 2 entity sets.
b. Ternary relationships, between 3 entity sets.
c. N-ary relationships, between n entity sets.
For binary relationships in particular, the relationship between the members of the two sets involved (the cardinality of the binary relationship) can be:
a. One-to-one (1-1) relationship.
One entity in a set is associated with exactly one entity in the other set.
b. One-to-many relationship.
One entity in a set is associated with one or more entities in the other set, while each entity in the other set is associated with exactly one entity in the first set.
c. Many-to-many relationship.
One entity in a set is associated with one or more entities in the other set, and the reverse also applies.
Translating the ERD into Tables
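A common way to translate a one-to-many relationship from an ERD into tables is to place a foreign key on the "many" side. A hedged T-SQL sketch, using the bank/customer entity sets mentioned above (table and column names are illustrative assumptions):

-- The "one" side of the relationship.
CREATE TABLE Bank (
    BankID   int         NOT NULL PRIMARY KEY,
    BankName varchar(60) NOT NULL
)
go
-- The "many" side carries a foreign key back to the "one" side.
CREATE TABLE Nasabah (
    NasabahID int         NOT NULL PRIMARY KEY,
    Nama      varchar(60) NOT NULL,
    BankID    int         NOT NULL REFERENCES Bank (BankID)
)
go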

File types.
A database is built from a collection of files. Files in application processing can be grouped into several types, including the following:
1. Master file.
Within an application, the master file is the most important file; it remains in existence for the entire life of the information system. Master files can be further divided into:
a. Reference master files, whose records are relatively static and rarely change in value. Examples are a salary table file and a course catalog file.
b. Dynamic master files, whose record values change frequently or are frequently updated as a result of transactions. Examples are an inventory master file, a customer master file, and so on.
2. Transaction file.
A transaction file is also called an input file. It records the data produced by each transaction as it occurs. For example, the unit quantity of an item can be found in the inventory master file, but that master file shows only the item's final unit status; the final quantity is the result of the transactions that have taken place. To see the transactions that affected the value in the master file, look at the transaction file. Another example is a sales transaction file, which holds data about the sales transactions that have occurred. A transaction file usually records the date of each transaction, showing when it took place.
3. Report file.
A report file is also called an output file; it holds information that will be displayed. It is created to stage a report, typically when the printer is not yet ready or is still being used by another process.
4. History file.
A history file is also called an archival file; it holds past data that is no longer active but must be kept for future needs.
5. Backup file.
A backup file is a copy of the files that are still active in the database at a particular point in time. It serves as a fallback in case the active database files are damaged or lost.
6. Working file.
A working file is also called a temporary file or scratch file. It is created temporarily by a program, either because computer memory is insufficient or to conserve memory during processing, and it is deleted when the process finishes.
File access and organization.
File access is a method that determines how a computer program reads the records of a file. A file can be accessed in two ways: sequentially (sequential access) or directly (direct access, also called random access). With the sequential access method, a record is read or written by starting at the first record and proceeding in order until the desired record is reached. With the direct access method, a record is read directly at its position in the file, without reading from the first record onward. File organization is the logical arrangement of the records within a file and how they are linked to one another. A file can be organized sequentially (sequential organization) or randomly (random organization). Although file organization and file access can be viewed separately, discussions of file organization usually cover both, as follows:
a. A sequential file is a file with sequential organization and sequential access.
b. An indexed sequential file, often called ISAM (indexed sequential access method), is a file with sequential organization and direct access.
c. A direct access file, also called a direct address file, is a file with random organization and direct access.
These organizations are called traditional or conventional file organizations, because they existed before database structures were developed. Database file organization can take the form of a hierarchical data structure, a network data structure, or a relational data structure. The relational data structure is the newest database file organization and the easiest to understand. It has the following characteristics:
a. Files take the form of tables that look just like sequential files.
b. Relationships between records are based on the values of key fields, not on addresses or pointers.
The relational data structure is used by more and more DBMS packages, such as Dbase, Foxbase, SQL, and so on.
Database design steps.
In the database design phase, the first task is to identify the files the information system being built will need. The database files the system requires can be seen in the design model, drawn as a data flow diagram (DFD).
In general, the database design steps are as follows:
a. Determine the database files needed by the new system.
The required files can be determined from the DFD of the new system that has already been drawn.
b. Determine the parameters of the database files.
Once the required files have been identified, the parameters of each file can also be determined. These parameters include:
· The type of file: master file, transaction file, temporary file.
· The storage medium of the file: hard disk, diskette, magnetic tape, CD.
· The organization of the file: sequential, random, indexed.
· The key field of the file.
The systems analyst can use the following forms to identify the database files to be designed:
Table 4.6: File requirements identification form
Table 4.7: Attribute (field) identification form for a file
Hardware and Software Requirements for Installing SQL Server 2000
The minimum hardware and software requirements for running Microsoft® SQL Server™ 2000 are listed in the following tables.
Hardware Requirements
This table shows hardware requirements for installing Microsoft SQL Server 2000 or SQL Server client management tools and libraries.
Hardware: Minimum requirements

Computer: Intel® or compatible, Pentium 166 MHz or higher.

Memory (RAM) (1):
Enterprise Edition and Enterprise Evaluation Edition: 64 MB minimum, 128 MB or more recommended.
Standard Edition and Developer Edition: 64 MB minimum.
Personal Edition and Desktop Engine (MSDE 2000): 128 MB minimum on Windows XP; 64 MB minimum on Windows 2000; 32 MB minimum on all other operating systems.

Hard disk space (2):
Enterprise, Enterprise Evaluation, Standard, Developer, and Personal Editions require 95 to 270 MB of available hard disk space for the database engine (250 MB for a typical installation); 50 MB for a minimum installation of Analysis Services (130 MB for a typical installation); and 80 MB for English Query.
Desktop Engine (MSDE 2000): 44 MB minimum.

Monitor: VGA or higher resolution; 800x600 or higher resolution required for the SQL Server graphical tools.

Pointing device: Microsoft Mouse or compatible pointing device.

CD-ROM drive: Required.

(1) Additional memory may be required, depending on operating system requirements.
(2) Actual requirements will vary based on your system configuration and the applications and features you choose to install.
Note Microsoft SQL Server 2000 does not have a hardware compatibility list (HCL). If your computer meets the minimum requirements listed in the preceding table, SQL Server 2000 software works on the hardware certified for use with the Microsoft Windows® operating system. For more information about hardware certified for use with the Windows operating system, see the Microsoft Windows Hardware Compatibility List on the Microsoft Web site.
Operating System Requirements
This table shows the operating systems that must be installed to use the various editions or components of Microsoft SQL Server 2000.
SQL Server 2000 edition or component: Operating system requirement

Enterprise Edition and Standard Edition:
Windows Server 2003, Standard Edition (1)
Windows Server 2003, Enterprise Edition (1)
Windows Server 2003, Datacenter Edition (1)
Windows 2000 Server
Windows 2000 Advanced Server
Windows 2000 Datacenter Server
Microsoft Windows NT Server version 4.0 with Service Pack 5 (SP5) or later
Windows NT Server version 4.0, Enterprise Edition, with SP5 or later

Enterprise Evaluation Edition and Developer Edition:
Operating systems listed above for Enterprise and Standard Editions
Windows XP Professional
Windows XP Home Edition
Windows 2000 Professional
Windows NT Workstation 4.0 with SP5 or later

SQL Server 2000 Personal Edition (2) and SQL Server 2000 Desktop Engine (MSDE 2000):
Operating systems listed above for Enterprise, Standard, Enterprise Evaluation, and Developer Editions
Windows Server 2003, Web Edition (1) (MSDE only)
Windows 98
Windows Me

Client Tools Support:
Windows 95 (3)
Windows 98
Windows Me
Windows NT Workstation 4.0
Windows 2000 Professional
Windows XP Professional
Windows XP Home Edition
Windows Server 2003, Standard Edition (1)
Windows Server 2003, Enterprise Edition (1)
Windows Server 2003, Datacenter Edition (1)
UNIX, Apple Macintosh, and OS/2 clients require Open Database Connectivity (ODBC) client software from a third-party vendor.

Connectivity Only:
Windows 95
Windows 98
Windows Me
Windows NT Workstation 4.0
Windows 2000 (all editions)
Windows XP Professional
Windows XP Home Edition
Windows Server 2003, Standard Edition (1)
Windows Server 2003, Enterprise Edition (1)
Windows Server 2003, Datacenter Edition (1)

(1) Windows Server 2003 requires SQL Server 2000 Service Pack 3 or later to be applied.
(2) SQL Server 2000 Personal Edition is offered for desktop and mobile use. Personal Edition does not contain the full functionality of Standard Edition. Analysis Services, including online analytical processing (OLAP), data mining, and other data warehousing features, are included in Personal Edition but cannot be installed on Windows 98 or Windows Me.
(3) Supported for client connectivity only; does not include graphical tools support.
Note When installing on Microsoft Windows NT® Server 4.0, Service Pack 5 (SP5) or later is required as a minimum for all SQL Server 2000 editions.
SQL Server 2000 is not supported on Windows NT 4.0 Terminal Server.
For installations of SQL Server 2000 Personal Edition or Desktop Engine (MSDE 2000) on Windows 98 computers without a network card, Windows 98 Second Edition is required.
Internet Requirements
This table shows Internet requirements related to using Microsoft SQL Server 2000.
Component: Requirement

Internet software: Microsoft Internet Explorer 5.0 is required for all installations of Microsoft SQL Server 2000, as it is required for Microsoft Management Console (MMC) and HTML Help. A minimal install is sufficient, and Internet Explorer is not required to be the default browser. Exception to the Internet Explorer 5.0 requirement: if you use the Connectivity Only option and do not connect to a server that requires encryption, Microsoft Internet Explorer 4.01 with Service Pack 2 is sufficient.

Internet Information Services: If you are writing XML applications, see System Requirements for the IIS Virtual Directory Management for SQL Server Utility.
Network Software Requirements
Microsoft Windows XP, Windows 2000, Windows 98, Windows 95, Windows Me, and Windows NT have built-in network software. Additional network software is required only if you are using Banyan VINES or AppleTalk ADSP. Novell NetWare IPX/SPX client support is provided by the NWLink protocol of Windows-based networking.
Note TCP/IP must be enabled at the operating system level before installing SQL Server 2000. For more information, see Network Libraries.
Considerations for Other Microsoft Products
The following Microsoft products require Service Release or Service Packs to operate correctly with SQL Server 2000.
Access 2000
Microsoft Access 2000 requires the installation of either Microsoft Office 2000 Service Release 1 (SR1) or Access 2000 SR1 to operate correctly with SQL Server 2000. If you run an earlier version of Access 2000, you cannot test Access data project (ADP) applications against SQL Server 2000, and you cannot access database diagrams, stored procedures, table designs, or view designs.
Other issues to be addressed in a future Access Service Release:
When you run Access 2000 with SR1, you can test ADP applications. You can also alter database diagrams, stored procedures, table designs, or view designs, but you cannot save any changes. A future Access Service Release will allow limited ability to save changes.
The Access 2000 Create Database Wizard cannot successfully create a SQL Server 2000 database. You can work around this by first creating the database using SQL Server Enterprise Manager, and then creating an ADP for the database using the Project (Existing Database) option on the New dialog box in Access 2000.
The Access 2000 Upsizing Wizard does not support upsizing to SQL Server 2000. You can work around this by using Data Transformation Services in the Enterprise Manager to import your MDB database file into SQL Server. You can then rename your MDB tables and create linked tables to the resulting SQL Server database with the same names as your original MDB table names.
Visual Studio 6.0
When you run Microsoft Visual Studio® 6.0, you cannot access database diagrams, stored procedures, table designs, or view designs in SQL Server 2000. Visual Studio 6.0 Service Pack 4 allows you to alter database diagrams, stored procedures, table designs, or view designs, but you cannot save them. A future Visual Studio Service Pack will allow a limited ability to save changes.
Database Design by Yulius Eka Agung Seputra,ST,MSi
Part 1: An overview of index-tuning tools
As part of my current job at Microsoft, I work with SQL customers worldwide, ranging from Fortune 500 financial institutions to online retailers to small partner outfits. I am engaged primarily to conduct performance tuning of their SQL Server systems.
When a system has a performance problem, and time is short (as it inevitably is), it's tempting just to "kill it with iron" – add more memory, more CPU horsepower, bigger and better disk subsystems, and so on. The problem is that if your design is fundamentally flawed then there is a strong chance that your application will not scale: doubling your horsepower will not double the performance. You might end up throwing a lot of hardware (and a lot of money) at the problem in order to reach your performance target.
Instead, your first port of call should be with the design of the application, and probably the "lowest-hanging fruit" here – the option that produces maximum gains with least impact on existing systems and processes – is to examine your indexing strategy.
When you run a query, SQL Server query optimizer's job is to find the best possible execution plan (based on internal costing algorithms) to provide the best response time. If the indexes required for implementing this optimal execution plan are not present, then the query optimizer is forced to choose a sub-optimal execution plan, for which indexes do exist. This could cause SQL Server to do more I/O and result in slower query performance. The need to have the right indexes in place cannot be overemphasized.
However, the task of identifying the right indexes is not necessarily straightforward. It requires a sound knowledge of the sort of queries that will be run against the data, the distribution of that data, and the volume of data, as well as an understanding of what type of index will suit your needs best. While this understanding is still essential, SQL Server 2005 does offer a helping hand in the form of some new tools – principally the Database Engine Tuning Advisor – that can help you determine, tune and monitor your indexes.
In this article, I will demonstrate how to use these tools to get answers to the following questions:
Which indexes do I need for my queries?
How do I monitor index usage and their effectiveness?
How do I identify redundant indexes that could negatively impact the performance of my DML queries (inserts, updates and deletes)?
As the workload changes, how do I identify any missing indexes that could enhance performance for my new queries?
Finding the right indexes for your workload
Determining exactly the right indexes for your system can be quite a taxing process. For example, you have to consider:
Which columns should be indexed (based on your knowledge of how the data is queried)
Whether to choose a single-column index or a multiple column index
Whether you need a clustered index or a non-clustered index
Whether or not (in SQL 2005) you could benefit from an index with included columns to avoid bookmark lookups
How to utilize indexed views (which the optimizer might access instead of the underlying tables to retrieve a subset of your data)
All of these options increase the complexity of identifying the right indexes for your system. You might be tempted to add indexes everywhere "just in case", but remember that with every new index introduced to improve the performance of the SELECT workload, the query optimizer has to do more work when running DML statements (insert, update or delete), as it now has to reorganize that index to accommodate the newly updated data in the table. There is a balance to be struck here.
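To make that trade-off concrete, here is a hedged sketch (the index name is an assumption; the table and columns come from the AdventureWorks Sales.SalesOrderHeader table used later in this article) of a covering non-clustered index that helps a SELECT but must be maintained by every insert, update and delete on the table:

-- Covering index: the included columns let the SELECT below avoid a bookmark lookup.
CREATE NONCLUSTERED INDEX IX_SalesOrderHeader_CustomerID_Demo
ON Sales.SalesOrderHeader (CustomerID)
INCLUDE (OrderDate, TotalDue);

-- Reads benefit: this query can be answered entirely from the index.
SELECT CustomerID, OrderDate, TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = 11000;

-- Writes pay: every INSERT and DELETE, and any UPDATE touching CustomerID,
-- OrderDate or TotalDue, must also maintain this index.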
Even once you've determined the perfect set of indexes, your job is not finished. Your workload will change over time (new queries will be added, older ones removed from the application) and this might warrant revisiting existing indexes, analyzing their usage and making adjustments (modifying/dropping existing indexes or creating new ones). Maintenance of indexes is critical to ensuring optimal performance in the long run.
SQL 2005 provides the following tools and features to help you find the right indexes for your queries, and then monitor and tune them:
Database Engine Tuning Advisor (DTA)
STATISTICS XML output
Dynamic Management Views (DMVs)
I will describe each one of them in the subsequent sections.
Using Database Engine Tuning Advisor (DTA)
The DTA is a new tool in SQL 2005 which replaces the Index Tuning Wizard in earlier versions of SQL Server. DTA can analyze both OLTP and OLAP workloads. You can either tune a single query or the entire workload to which your server is subjected. Based on the options that you select, you can use the DTA to make recommendations for several Physical Design Structures (PDS), which include:
Clustered indexes
Non-clustered indexes
Indexes with included columns (to avoid bookmark lookups)
Indexed views (see the sketch after this list)
Partitions
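As a hedged illustration of an indexed view (the view name is an assumption; the columns come from the AdventureWorks Sales.SalesOrderHeader table), the optimizer may use such a pre-aggregated structure instead of the underlying table:

-- The view must be schema-bound and, because it aggregates, must include COUNT_BIG(*).
CREATE VIEW Sales.vOrderTotalsByCustomer
WITH SCHEMABINDING
AS
SELECT CustomerID,
       COUNT_BIG(*)  AS OrderCount,
       SUM(SubTotal) AS SubTotalSum
FROM Sales.SalesOrderHeader
GROUP BY CustomerID;
GO

-- Materialize the view by creating a unique clustered index on it.
CREATE UNIQUE CLUSTERED INDEX IX_vOrderTotalsByCustomer
ON Sales.vOrderTotalsByCustomer (CustomerID);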
Let's walk through the steps involved in using DTA effectively.
Generating the DTA workload
The first step is to collect a workload for DTA to analyze. You can do this one of two ways:
Using Management Studio – if you need to optimize the performance of a single query, you can use Management Studio to provide an input to DTA. Simply type the query in Management Studio, highlight it and then right click on it to choose Analyze in Database Engine Tuning Advisor. This will launch DTA with your query as the input.
Using Profiler – if you want to determine the optimum index set for the entire workload of your server, you should collect a profiler trace using the built-in TUNING template.
To unleash the true effectiveness of DTA, you should always use a representative profiler trace. Make sure that you subject your server to all the queries that will typically be run against the data, while you are collecting the trace. This could lead to a huge trace file, but that is normal. If you simply collect a profiler trace over a 5-10 minute period, you can be pretty sure it won't be truly representative of all the queries run against your database.
The TUNING template captures only minimal events, so there should not be any significant performance impact on your server.
As part of the code download for this article, I've provided a profiler trace (Workload2Analyze.trc) that was captured with the TUNING template while running numerous queries (Queries4Workload.sql) against the AdventureWorks database. I would strongly recommend that you use this trace, or one from your own server, to get a hands-on perspective of DTA.
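If you want to sanity-check what a trace file contains before handing it to DTA, the built-in fn_trace_gettable function can read it from T-SQL; a minimal sketch, assuming the trace has been saved to a local path of your choosing:

-- Load the trace file into a rowset; the path is an assumption for this sketch.
SELECT TextData, Duration, CPU, Reads, Writes
FROM sys.fn_trace_gettable(N'C:\Traces\Workload2Analyze.trc', DEFAULT)
WHERE TextData IS NOT NULL;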
Consuming the workload using DTA
Having collected a truly representative profiler trace, you can use it as an input to DTA, which will then generate recommendations. You can perform one of the following two types of analysis.
SCENARIO I: Keep my existing PDS and tell me what else I am missing
This type of analysis is common and is useful if you have previously established the set of indexes that you deem to be most useful for your given workload, and are seeking further recommendations. To conduct this analysis:
Launch a new session in DTA.
Choose the profiler trace as the input to this session.
In the Select databases and tables to tune section select your target database (AdventureWorks, if you are using my script).
In the Database for workload analysis dropdown, you can either use the same database or Tempdb (or any other database). I generally use Tempdb for this purpose.
At the Tuning Options tab, select the following options:
Physical Design Structures to use in database -> Indexes and Indexed views
Physical Design Structures to keep in database -> Keep all existing PDS
Uncheck the Limit tuning time checkbox.
Hit START ANALYSIS and DTA will start consuming your workload.
Once DTA is finished consuming the workload, it will list all its recommendations, under the Recommendations tab. We will talk more about reading and implementing DTA's recommendations in the section Reading output from DTA.
SCENARIO II: Ignore my existing PDS and tell me what query optimizer needs
In scenario I, DTA makes recommendations for any missing indexes. However, this doesn't necessarily mean your existing indexes are optimal for the query optimizer. You may also consider conducting an analysis whereby DTA ignores all existing physical design structures and recommends what it deems the best possible set of PDS for the given workload. In this way, you can validate your assumptions about what indexes are required.
To conduct this analysis, follow steps 1 to 6 as above, except that at step 5b, choose "Do not keep any existing PDS".
Contrary to how this might sound, DTA will not actually drop or delete any existing PDSs. This is the biggest advantage of using DTA, as it means you can use the tool to perform what-if analysis without actually introducing any changes to the underlying schema.
A note on the DTA Tuning Log
As it consumes the workload, DTA will store in the DTA Tuning log (found under the Progress tab, in the bottom section) details of any errors it encounters.
Typically, you might see a message of the form "XX% of the consumed workload has syntax errors". The tuning log will provide further details of these errors, many of which can be safely ignored. For example:
Statement doesn't reference any table – statements such as SET or DECLARE in your workload will cause these errors, and they can generally be ignored
Statement references only small tables – DTA will not tune a query if it references a small table (10 data pages or less)
Incorrect syntax or object related errors – if these errors point to statements containing keywords such as BEGIN, TRY or INSERTED, then you can ignore them.
To learn more about how to analyze the tuning log, refer to About the Tuning Log and Determining Whether Events Can Be Tuned in BOL.
Reading output from DTA
After consuming the workload, DTA presents, under the Recommendations tab, a set of recommendations for tuning your PDS. I tend to focus on the following sections:
Recommendation – this is the action that you need to take. Possible values include Create or Drop.
Target of Recommendation – this is the proposed name of the PDS to be created. The naming convention is typical of DTA and generally starts with _dta*. However, I recommend that you change this name based on the naming convention in your database.
Definition – this is the list of columns that this new PDS will include. If you click on the hyperlink, it will open up a new window with the T-SQL script to implement this recommendation.
Estimated Improvements – this is the estimated percentage improvement that you can expect in your workload performance, if you implement all the recommendations made by DTA.
Space used by recommendation (MB) – under the Tuning Summary section of the Reports tab, you can find out the extra space in MB that you would need, if you decide to implement these recommendations.
Inbuilt analysis reports
There are 15 inbuilt reports under the Reports tab. However, the following three reports are the most important.
NOTE:
For information on these and the other 12 reports, visit the article Choosing a Database Engine Tuning Advisor Report on MSDN.
Index Usage Report (current)
Start with this report to see how your existing indexes are being used by the queries running against your server. Each index that has been used by a query is listed here. Each referenced index has a Percent Usage value which indicates the percentage of statements in your workload that referenced this index.
If an index is not listed here, it means that it has not been used by any query in your workload.
TIP: identifying indexes that are not used
If you are certain that all the queries that run against your server have been captured by your profiler trace, then you can use this report to identify indexes that are not required and possibly delete them.
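Outside of DTA, a rough DMV-based check (SQL 2005) approximates the same idea by listing indexes with no recorded reads since the last restart; this is a hedged sketch, and a short uptime will understate real usage:

-- Indexes in the current database with no seeks, scans or lookups recorded
-- since SQL Server last started.
SELECT OBJECT_NAME(i.object_id) AS table_name,
       i.name                   AS index_name
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS s
       ON s.object_id   = i.object_id
      AND s.index_id    = i.index_id
      AND s.database_id = DB_ID()
WHERE i.index_id > 0                                        -- ignore heaps
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
  AND (s.index_id IS NULL
       OR s.user_seeks + s.user_scans + s.user_lookups = 0);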
Index Usage Report (recommended)
Next, look at this report to identify how index usage will change if the recommended indexes are implemented. If you compare these two reports, you will see that the index usage of some of the current indexes has fallen while some new indexes have been included with a higher usage percentage, indicating a different execution plan for your workload and improved performance.
Statement Cost Report
This report lists individual statements in your workload and the estimated performance improvement for each one of them. Using this report, you can identify your poorly performing queries and see the sort of improvement you can expect if you implement the recommendations made by DTA.
You will find that some statements don't have any improvements (Percent improvement = 0). This is because either the statement was not tuned for some reason or it already has all the indexes that it needs to perform optimally.
Implementing DTA's recommendations
By now, we have collected a workload using Profiler, consumed it using DTA and got a set of recommendations to improve performance. You then have the choice to either:
Save recommendations – you can save the recommendations in an SQL script by navigating ACTIONS | SAVE RECOMMENDATIONS. You can then manually run the script in Management Studio to create all the recommended PDS.
Apply recommendations using DTA – if you are happy with the set of recommendations then simply navigate ACTIONS | APPLY RECOMMENDATIONS. You can also schedule a later time to apply these recommendations (during off-peak hours, for example)
More often than not I use the Save Recommendations option. It means that I can easily compare the resulting PDS set for several scenarios, and the recommendation reports are also useful for future reference. As the workload on the server changes, you need to re-evaluate the index usage and make corrections (drop redundant indexes or create new ones). You can compare the index usage on a periodic basis (maybe quarterly or so) and see how the usage is changing.
Performing what-if analysis using DTA
This is a very cool feature of DTA. Suppose you don't want to apply all the recommendations that DTA provided. However, since the Estimated Improvement value can only be achieved if you apply all of these recommendations together, you are not really sure what kind of an impact it will have if you only choose to apply a sub-set of these recommendations.
Deselect the recommendations that you don't want to apply. Now, go to ACTIONS | EVALUATE RECOMMENDATIONS.
This will launch another session with the same options as the earlier one. However, now when you click on START ANALYSIS, DTA will provide data on estimated performance improvements, based on just this sub-set of the recommendations.
Again, the key thing to remember is that DTA performs this "what-if" analysis without actually implementing anything in the database.
TIP: checking for redundant objects left behind by DTA
In theory, DTA should not leave any extra "hypothetical" PDS (indexes and so on) in your database after the analysis is complete. However, you can easily check for this by running the following query:
select name from sys.indexes where is_hypothetical = 1

Using STATISTICS XML
If you just wish to determine the optimum index set for a particular query, then the STATISTICS XML option provides a quick alternative to DTA. When a query is executed with STATISTICS XML enabled, SQL Server not only returns the query result set, but also retrieves the execution plan for the query in the form of an XML document (Showplan XML). For example, try executing the following SQL statement:

SET STATISTICS XML ON
GO
USE AdventureWorks
GO
SELECT CustomerID, PurchaseOrderNumber, TotalDue
FROM Sales.SalesOrderHeader
WHERE ShipMethodID > 2
AND TotalDue > 200.00
AND TerritoryID = 3;
GO
SET STATISTICS XML OFF
GO

Below the query results, you should see a link to an XML document. The most important elements to look for in the XML are:

StatementText – its value is the query that you just ran. You can confirm that this matches the query that you ran in Management Studio.
MissingIndexes – details of any missing indexes that the optimizer believes would improve the performance of this query are logged here.

For the above query, the MissingIndexes section indicates that we have one missing index on the Sales.SalesOrderHeader table. Look for the value of Impact on the MissingIndexGroup element, which estimates the percentage improvement in response time you can expect to see if you implement the set of indexes highlighted in this XML. Within the MissingIndex element:

ColumnGroup Usage="EQUALITY" lists the index columns that are used for equality in the query (TerritoryID, in our case). Of all the conditions specified in the query, the query optimizer first looks for an index on the equality column in order to evaluate the rows that meet the criteria.
ColumnGroup Usage="INEQUALITY" lists the index columns that are used to evaluate inequality conditions in the query (ShipMethodID and TotalDue, in this case).
ColumnGroup Usage="INCLUDE" lists the index columns that are included to cover the query (PurchaseOrderNumber and CustomerID). You can specify included columns for non-clustered indexes to avoid a bookmark lookup in your execution plan.

NOTE: To find out more about included columns on non-clustered indexes, please refer to the Index with Included Columns article on MSDN.

To create this missing index, you should use the following template:

CREATE NONCLUSTERED INDEX index_name
ON table_name (equality_columns, followed by inequality_columns)
INCLUDE (included_columns);

So to create the missing index in this case, the command is as follows:

CREATE NONCLUSTERED INDEX TestIndex
ON Sales.SalesOrderHeader (TerritoryID, ShipMethodID, TotalDue)
INCLUDE (PurchaseOrderNumber, CustomerID);

Using index-related Dynamic Management Views (DMVs)
Dynamic Management Views are a new feature of SQL 2005. They maintain server-wide state and configuration information for your SQL Server installation. There are more than 50 DMVs in SQL 2005, and the information stored in them can be used to monitor the health of a server instance, diagnose problems, and tune performance. There are several sets of DMVs, each of which stores particular server information. One such set stores information about indexes and their usage. If the query optimizer finds that the indexes required for implementing a particular execution plan are not present, it will log information about these missing indexes in the index-related DMVs.
This information is persisted in the DMVs until the next restart of SQL Server, or until the metadata associated with the objects is dropped. You can query these DMVs to find the missing indexes, or store the information for periodic analysis. Let's take a look at how you can use the information in the DMVs and how you can interpret the output to create the requisite indexes.

Before we begin, restart SQL Server on your test machine so that the information in each DMV is cleared. Next, in a new query window, execute the same query as in the previous section, but with the STATISTICS XML option turned off:

SELECT CustomerID, PurchaseOrderNumber, TotalDue
FROM Sales.SalesOrderHeader
WHERE ShipMethodID > 2
AND TotalDue > 200.00
AND TerritoryID = 3;

The query optimizer will kick in to optimize the query and find the best execution plan. We already know that we have a missing index, so the query optimizer will settle for a sub-optimal plan. However, it will make a note about this in the relevant index-related DMV. To retrieve the missing index details, simply query the relevant DMV, as follows:

select * from sys.dm_db_missing_index_details

This query returns the id of the object on which you should create the index, which happens to be Sales.SalesOrderHeader, along with the equality_columns, inequality_columns and included_columns, which are as described previously. As you will see, the DMV recommends creating the same "missing" index as the STATISTICS XML option, and the syntax for creating this index is the same as described above. Every time you restart SQL Server, these DMVs are flushed, so you might want to copy their contents into other tables in your database for periodic analysis.
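A hedged sketch of one way to keep such a history (the snapshot table name is an assumption) joins the missing-index DMVs and appends a dated copy that survives restarts:

-- One-time setup: a table to hold dated snapshots of missing-index suggestions.
CREATE TABLE dbo.MissingIndexSnapshot
(
    capture_date       DATETIME       NOT NULL DEFAULT GETDATE(),
    database_id        SMALLINT       NOT NULL,
    object_id          INT            NOT NULL,
    equality_columns   NVARCHAR(4000) NULL,
    inequality_columns NVARCHAR(4000) NULL,
    included_columns   NVARCHAR(4000) NULL,
    user_seeks         BIGINT         NULL,
    avg_user_impact    FLOAT          NULL
);

-- Run periodically (for example from a SQL Agent job) to preserve the suggestions.
INSERT INTO dbo.MissingIndexSnapshot
    (database_id, object_id, equality_columns, inequality_columns,
     included_columns, user_seeks, avg_user_impact)
SELECT d.database_id, d.object_id, d.equality_columns, d.inequality_columns,
       d.included_columns, s.user_seeks, s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
JOIN sys.dm_db_missing_index_groups AS g
     ON g.index_handle = d.index_handle
JOIN sys.dm_db_missing_index_group_stats AS s
     ON s.group_handle = g.index_group_handle;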
Using the right tool for the job
The STATISTICS XML option and the index-related DMVs are both useful tools for assessing the impact of a particular query on your indexing requirements. The biggest limitation in each case is that they cannot be used to fine-tune the entire workload on your server; if you are running mission-critical applications that involve a high number of queries, this approach will be very time and labor intensive. Furthermore, when using the DMVs, you can only access raw information about columns on which indexes might be missing. This approach also has several limitations and is less accurate for queries involving inequality predicates. For more details on the limitations, refer to the Limitations for Using the Missing Indexes Feature article in BOL.

Database Engine Tuning Advisor is a more sophisticated and advanced tool, and provides more options than any of the other methods described in this article. With DTA, you can use a representative workload and, as a result, the recommendations it makes are more accurate. DTA can also be used against SQL 2000, while the other features cannot. Plus, you can perform what-if analysis using DTA without having to introduce any changes to your database. As a best-practice recommendation, I would strongly advocate that you use the DMVs or the STATISTICS XML option only as a means to quickly assess the impact of a query on indexing requirements. You should always validate your findings, and the estimated improvements, using Database Engine Tuning Advisor with a representative workload.

Conclusion
In this first installment, I provided an overview of how to use the different tools and features in SQL 2005 to determine the optimal set of indexes for the workload on your server. My personal preference is to use the Database Engine Tuning Advisor. To get the most effective recommendations, you should capture a representative workload of your server using the TUNING template and use it as an input to DTA. In the second part of this article, I will delve deeper into how to measure and monitor the usage and effectiveness of your existing indexes using DTA and DMVs. I'll also cover some more advanced what-if analysis using DTA, and some best practices for its general usage.