Killing Sessions with External Wait Types

This afternoon I was asked a question that comes up once or twice a year by email. I answered as usual, and only later realized that the topic was a reasonably good match for this month’s TSQL Tuesday topic. I thought I might share the question, and the answer, with a larger audience this time.

Every once in a while I’ll receive a question about a monitoring batch whose spid has gotten hung, and attempts to kill the session simply leave the spid in a rollback state for hours or days on end. Almost invariably these sessions are presenting with an external wait type – usually either MSQL_XP or one of the PREEMPTIVE_OS_* wait types. The usual suspects are calls to xp_fixeddrives and other extended stored procedures backed by xpstar.dll and the sp_oa* procedures that support OLE Automation.

Here is an example of a query which will cause the problem. (Note that enablement of xp_cmdshell is a security hazard and thus this code should only be attempted on a test machine)

exec xp_cmdshell 'powershell -command "sleep 100"'

When executed, this command will execute for exactly 100 seconds. Executing the following code will yield some information about what the session is doing.

from sys.dm_exec_requests
where session_id = <session_id>
session_id wait_type command

If the session running xp_cmdshell is killed, the session will move to a killed/rollback state, but retain the same wait type.

session_id wait_type command

It will remain in this state until the 100 seconds elapse.

Now, this is all well and good for a session that will die after 100 seconds, but what about a session which is truly and intractably hung? Or what if it was a 10,000 second wait? Is there anything to be done?

Well, it depends. In this specific case, yes. The xp_cmdshell extended stored procedure is running its command out of process to SQL Server. In fact, you can see the life of the session using Sysinternals ProcMon. Here I’ve reduced the wait to 1 second just to capture it in a neat window.

procmon sleep command

Click to enlarge

SQLServer creates the cmd.exe process and passes it the command, which in turn calls powershell. Each of these processes is a full fledged process visible in Process Explorer.

click to enlarge

So in this particular case, if the sleep command was for 10,000 seconds, after the session kill failed, it would be possible to end the session by killing powershell.exe using Process Explorer – and that does indeed work.

However, there are other external waits which cannot be killed in this way, and these are generally a bit more gnarly to work with. For example, on my machine, it takes about 7 second for the following command to run:

xp_dirtree 'c:\windows'

And if I watch it in procmon, I can see that it is SQL Server’s own process which is walking the directory tree.

procmon xpdirtree command

click to enlarge

Now, this command is having no trouble, and I can kill it quite easily. But as you can see, there’s no process other than SQL Server itself involved.

What I usually see is that either xp_fixeddrives, xp_readerrorlogs, xp_enumerrorlogs, or one of the OLE Automation stored procedures gets snagged up while running in process to SQL Server. There’s nothing to kill other than SQL Server itself in this case, and the process generally has to remain a zombie until the underlying problem is solved and/or the SQL Server service is cycled.

In the case of OLE Automation stored procedures specifically, it is possible to run these commands out-of-process to SQL Server just as in the example with xp_cmdshell above. Sometimes this can be done simply by setting the context flag of sp_OACreate. In other cases, it may be necessary to fiddle with registry settings to change the threading model of the assembly being loaded from “Apartment” to “Both” – something which, it goes without saying, is best tried in a test environment first.

So, that’s my experience with stuck spids and external wait types. If you’re lucky, when you get stuck with one of these, it’s an out of process wait. If you’re not… well, you may be waiting for a while longer.

December 11, 2013 Posted by: 6 Comments - Permalink
6 Responses to Killing Sessions with External Wait Types
  1. Thanks for joining in on T-SQL Tuesday #49! I’m glad someone dealt with the external wait types.

  2. […] Good friend and former colleague Vicky Harp (blog|@vickyharp) shared some tactics for dealing with hung sessions from the rarely discussed external OS wait type in Killing Sessions with External Wait Types. […]

  3. Jeff Moden says:

    Form the blog article: “(Note that enablement of xp_cmdshell is a security hazard and thus this code should only be attempted on a test machine)”

    Gosh. There’s a whole lot of people that say that and believe that but enabling xp_CmdShell is NOT a security problem. Bad Security is the only problem. Stop and think about it. Who can enable xp_CmdShell? Only those that have “SA” privs (or Control Server privs but we’ll stick to “SA” to make it easier to talk about.). Now, who are the only people that can use xp_CmdShell? The answer is, unless you’ve made the terrible mistake of giving an idividual low priv’ed login/user privs to run it directly, only “SA”s can use it.

    Now, ask yourself, if an attacker (internal or external) breaks into your system as “SA”, is there anything in SQL Server that will prevent the attacker from enabling and using xp_CmdShell? Even if there was, you’re totally screwed but the answer is “No, there is nothing to prevent it.” In fact, most hacker software is setup to check if it is enabled and automatically enable it if it’s not. There’s absolutely no layer of protection if someone gets into your server as “SA”.

    What about the DBAs using it? If you don’t want them to use xp_CmdShell to elevate their privs, then you need to do one of two things. Either make it so the SQL Server login can’t “see” more than you’d want any of the DBAs to see or you need to throttle back the privs of the DBAs themselves. After all, “DBA” stands for “DataBase Administrator” and not “System Administrator”. It IS possible to give DBAs the necessary privs to do their jobs WITHOUT them having “SA” or “Control Server” privs.

    Like I said, xp_CmdShell is NOT a security issue. Whether you use xp_CmdShell or not, bad security is the only security issue.

    –Jeff Moden

    • Sean Nguyen says:

      I totally agree with Jeff’s comment above. It kills me every time i see the warning text from MS about how big of a security risk it is to enable “xp_cmdshell”. :)

  4. Manish says:

    Really a good article every DBA to read, who face this kind of issue frequently. Had I read this earlier, I would have been able to resolve many issues. But still not late.

  5. Kris Hokanson says:

    Ran into precisely this issue today at a client. Started happening right after installing the Oracle client so we could use it for an Oracle linked server. Linked server queries would hang and could not be killed. Even connecting through SSMS and trying to enumerate all server objects would hang. Ended up having to restart the server and things seem to have settled down. Oh well.

Leave a Reply

%d bloggers like this: