02/20/2018
by Marlon Ribunal
0 comments

Killing A Service Broker SPID

Killing a SPID shouldn’t be complicated. Execute the command KILL [SPID] and that should kill the session. But there are situations where a SPID that is spawned by the Service Broker stays alive even after the queue is stuck in limbo.

No matter how many times you kill the SPID, it will find a way to re-connect and will continue to sap your CPU. You would think that a server reboot would kill the SPID as the reboot will restart the service. But, no, the SPID will restart again. Setting the database to SINGLE_USER? Nope. That does not work either.

Ok, this is more of a note to self than a real blog post.

Here’s a quick way to KILL a SPID spawned by a Service Broker queue.

Run sp_whoisactive to check the program_name of the running SPID. The program name should give you the identification of the running process.

Find Queue Name

Then, get all the Conversation_Handle of the Queue

That gives us all the Conversation_Handle of the Queue, then we can then End the conversations, like,

Now we can go back to the active tasks of the Queue to get all the SPIDs

Then, based on that list, we can then issue the KILL command.

Again, you cannot just KILL SPID or end any Service Broker conversation willy-nilly. You have to do your due diligence before deciding to kill the SPIDs or end the Service Broker tasks.

Here’s a list of useful DM and Catalog Views that are related to Service Broker:

You can do something like this to get more info about the queues and the tasks or processes associated with them

02/13/2018
by Marlon Ribunal
0 comments

Database Maintenance Without Breaking The Bank

Database maintenance does not have to be expensive. There are free tools out there that will make your life easier. Of course, commercial DBA tools are worth their price – they are paid for a reason. Software like SolarWinds Database Performance Analyzer and Redgate SQL Monitor are two examples of SQL Server tools that can make anyone a better database steward.

If you do not have access to these tools, that doesn’t mean you’re stuck with your homebrewed solutions. If you are on a shoestring budget and commercial tools are not included in your annual budget allocation, the following free tools can help you survive your day-to-day job.

Glenn Berry’s Diagnostic Queries

If you want to know more about the configuration of your SQL Server instance, these are the queries you need. It’s not just configuration but you can also glean the overall health of your SQL Server from the various information that these queries provide.

The queries provide the following basic configuration information:

  • SQL Server Version, Update Level, and OS Information
  • CPU information
  • Global Trace Flags
  • Memory
  • SQL Server Agent Alert Info
  • Volume Info for all LUNs
  • Drive Latency Info
  • Etc.

It also provides performance related info:

  • IO Requests
  • Missing Indexes
  • CPU Utilization
  • Buffer Usage
  • Wait Stats
  • Etc.

Ola Hallengren SQL Server Maintenance Solution

Just like Glenn Berry’s Diagnostic Scripts, Ola Hallengren’s solution is a group of queries “for running backups, integrity checks, and index and statistics maintenance.” If you have a big daily job that gets slower and slower over time, I suggest that you set up the Index and Statics maintenance to run on regular schedule.

Here’s a typical command to execute;

You can also set up a SQL Server Integrity Check for the databases, tables, filegroups, indexed views, etc.

Brent Ozar’s sp_Blitz

The sp_Blitz suite has become a day-to-day tool for me. Whether you’re checking the overall health of your SQL Server (sp_Blitz) or maintaining your Indexes (sp_BlitzIndex), this sp_Blitz has got you covered. Want to know the worst-performing stored procedure? sp_BlitzCache can help you with that. Is your SQL Server slow? Let’s do sp_BlitzFirst on that. sp_Blitz seems to have covered all the important parts of database maintenance.

Adam Machanic’s sp_whoisactive

If we talk about helpful tools on SQL Server or ask data pros about their favorite tools, I’m sure the sp_whoisactive stored procedure comes up in the conversation. In fact, it is part of my heuristic approach to performance tuning, which you can read about here.

Paul Randal’s Wait Stats Scripts

There are great scripts on Paul Randal’s blog but the Waits Stats scripts have got to be my favorite. Waits is probably part of everybody’s performance tuning method. Waiting tasks can reveal the pain points of your SQL Server.

SentryOne Plan Explorer

Who still uses the built-in SSMS Execution Plan viewer nowadays? If you do a lot of query optimization or troubleshooting, this is the most important tool you need. There is just a lot of features in this tool that you can’t find in the SSMS Execution Plan Viewer. No database maintenance tools stash is complete without Plan Explorer.

To sum up, you don’t need to shell substantial amount of money to have a decent set of tools for database maintenance. There are lots of great tools like the ones listed above. What is your favorite tool? Share it in the comments below.

 

02/06/2018
by Marlon Ribunal
0 comments

Waiting Tasks Reveal Pain Points

Waiting tasks quickly shows you where to look for the bottlenecks. This is only one of the many times that I didn’t have to go beyond identifying the waiting tasks to quickly troubleshoot a production performance issue.

I have encountered a case where an application filter got stuck in loading status and not returning any data (well, it’s stuck). This was impacting the overall performance of the application. Normally, I would start with my simple heuristic troubleshooting method to get an idea of what’s going on in the backend. But since the issue is quite specific and the application module is identified in this case, I can fairly conclude that the reason for the stuck procedure is due to requests waiting for resources or whatever they are waiting for.

Waiting Tasks

Thanks to Paul Randal (B | T) for his sys.dm_os_waiting_tasks script. Running that script revealed something like the Waiting Tasks you see below (snapshot only, showed different wait time, status and waits at different points):

sqlserver waiting tasks

Paul was even kind enough to share a quick insight into what could possibly be causing all these locks:

Symptoms

So, that’s the first thing that I checked. Using Paul’s comprehensive SQL Server Wait Types Library, I looked for the definition of the waits that popped out:

IO_COMPLETION – “This wait type represents a variety of synchronous read and write operations in data files are not related to tables, plus reads from the transaction log.” From the snapshot returned by the query, I couldn’t really make anything out of this wait. From what I saw, this wait is incurring lots of reads and CPU resources. The SPID was running for more than 4 hours at this point. I looked up for more info on this wait, and it usually occurs in any of the following situations: Long-running I/O-bound operations, including BACKUP, ALTER DATABASE, CREATE DATABASE or database auto-growth. There were no such database maintenance operations within that timeframe though.

What the Waiting Tasks revealed, though, was that the top SPID wasn’t stuck. The task was still running but it’s just taking time and, therefore, taking substantial resources with it. At the rate it was going, it could have run for another 4-6 hours which could bring the SQL Server to its knees and render the application unusable from that point on.

LCK_M_SCH_S – “This wait type is when a thread is waiting to acquire a Schema Stability (also called Schema Share) lock on a resource and there is at least one other lock in an incompatible mode granted on the resource to a different thread.” This wait on the SPID shown in the query snapshot is quite obvious. This is a lock caused by the SPID on top of the chain.

LCK_M_IS – “This wait type is when a thread is waiting to acquire an Intent Shared lock on a resource and there is at least one other lock in an incompatible mode granted on the resource to a different thread.” Again, just like the above lock wait, this is caused by what’s on top of the blocking chain.

Findings

So what could have possibly caused the above waits? The primary cause of the blocking chain is the SPID on top that was on that IO_COMPLETION wait at the time the query snapshot was taken (see screen capture above). One of the stored procedures involved in this issue is writing data into a text file (orders). I cannot say that has something to do with the IO_COMPLETION.

I checked the metrics in Solarwinds DPA, and it showed a somewhat high signal wait percentage around the time this issue has occurred. The percentage around the same time was about 42%. Solarwinds has an info about this signal wait percentage that says “Anything over 20% would indicate that there is a possible CPU resource bottleneck.”

Fix

The spike in resource demand due to high usage of the application at the same time the database was writing data into the text files can certainly affect the CPU resources. Plus, the top SPID that caused all these blocking did not only incur most of the resources but also blocked a table that was also used by the other stored procedures down the chain, thus that explained the LCK_M_SCH_S & LCK_M_IS waits.

At that point, the action to take to fix the performance issue was becoming obvious. The SPID on top of the blocking chain didn’t seem to indicate that it would complete soon. There was only one command to make the application work again: KILL SPID.

Once the KILL command was issued against the top SPID, all the other dependent locks were released and the application was performing well again almost instantly after that.

But of course, you have to take extra precautions before you pull the plug on any task. If I kill the task, will it spawn another data issue? Will it cause other processes to fail? That’s your call.

KILL has saved the day again.