Thursday, April 28, 2011

When to use FltGetNewSystemBufferAddress

One of the really annoying operations that some file system filters need to deal with is the directory change notification. This comes in the form of an IRP_MJ_DIRECTORY_CONTROL with an IRP_MN_NOTIFY_CHANGE_DIRECTORY minor function. This operation allows a client of the file system (a user mode application for example) to be told when someone changes something about the contents of a directory. For more information on how this is exposed through the Win32 API, see the MSDN page on Obtaining Directory Change Notifications. In the file system this is implemented using a set of FsRtl routines (FsRtlNotifyFullChangeDirectory, FsRtlNotifyFullReportChange) and the FastFat source code in the WDK is a pretty good example of how a file system uses these APIs.

Not all file system filters need to deal with this operation. The pleasure is typically reserved for namespace virtualization filters or content virtualization filters (filters that don't change the namespace but only the contents of the objects in the namespace, like encryption or compression filters). What makes this complicated is that a filter might need to generate more events than the file system (if it manages virtual objects then it needs to send notifications for those), or to hide some events the file system generates (if the filter does some things "behind the scenes" which might trigger events that shouldn't be seen above the filter), or typically both. Windows itself uses these notifications (in explorer.exe) and so not handling them properly usually results in some pretty interesting behavior.

Now, in terms of implementation, the FsRtl routines have one particularly interesting feature. Most notification mechanisms in the file systems are based on the same general pattern: whoever needs the notification sends an IRP into the FS, the FS pends the IRP and whenever it needs to signal that whatever thing the caller wanted to be notified about has happened it simply completes the IRP. This is the same story with oplocks and it is a pretty good mechanism in general. However, there is an interesting problem facing whoever is implementing this mechanism. What if the caller of this request needs some additional information about what has happened? Well, they can just pass in a buffer in the IRP, right? The problem here is that in order for the file system to be able to fill in that buffer it must either have an MDL for it or be in the right process context (also keep in mind that these requests are quite long-lived; they might be stuck in the file system for minutes or hours). However, MDLs and physical pages are precious resources (well, less so since having 4GB of RAM is pretty common these days but they used to be more precious back when this was designed) and being in the right process context is more complicated when returning STATUS_PENDING. So the FsRtl function in this case (where there isn't already an MDL and the buffer is not a SystemBuffer) will simply pend the request as is. When it needs to complete the request it will allocate a system buffer, put it into SystemBuffer and send it back up to the IO manager, which will know that it needs to copy the data from SystemBuffer into the user's buffer and then free SystemBuffer (this is done in other places and as far as I can tell IRP_DEALLOCATE_BUFFER tells the IO manager to do this).
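
The sequence described above can be modeled in user mode. This is only a sketch under stated assumptions: `notify_request`, `fs_complete_notify` and `io_finish_request` are hypothetical names invented for the illustration, `malloc`/`free` stand in for pool allocations, and the `dealloc_buffer` field only plays the role that IRP_DEALLOCATE_BUFFER appears to play. It is not the FsRtl or IO manager implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a pended notification request. */
typedef struct {
    char  *user_buffer;    /* caller's buffer; never touched while pended */
    size_t user_length;    /* size of the caller's buffer */
    char  *system_buffer;  /* NULL for as long as the request is pended */
    size_t information;    /* number of bytes produced at completion */
    int    dealloc_buffer; /* plays the role of IRP_DEALLOCATE_BUFFER */
} notify_request;

/* "File system" side: the system buffer is allocated only when there is
 * something to report, not when the request is pended.
 * Returns 0 on success, -1 on failure. */
int fs_complete_notify(notify_request *req, const char *data, size_t len)
{
    assert(req != NULL && data != NULL);
    if (len > req->user_length) {
        return -1;                       /* would not fit in the caller's buffer */
    }
    req->system_buffer = malloc(len ? len : 1);
    if (req->system_buffer == NULL) {
        return -1;
    }
    memcpy(req->system_buffer, data, len);
    req->information = len;
    req->dealloc_buffer = 1;             /* ask the "IO manager" to free it */
    return 0;
}

/* "IO manager" side: copy the data to the caller's buffer, then free the
 * system buffer if the request says so. */
void io_finish_request(notify_request *req)
{
    assert(req != NULL);
    if (req->system_buffer == NULL) {
        return;                          /* nothing was attached */
    }
    memcpy(req->user_buffer, req->system_buffer, req->information);
    if (req->dealloc_buffer) {
        free(req->system_buffer);
        req->system_buffer = NULL;
        req->dealloc_buffer = 0;
    }
}
```

The point of the model is that nothing is allocated or locked during the (possibly very long) time the request is pended; the cost is paid only at completion.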

This is a particularly interesting problem for minifilters because, as I've explained in the post about the COMPLETION_NODE, the parameters that are shown to the minifilter in the postOp callback are the same parameters that it has seen during the preOp callback. Those parameters didn't have anything in SystemBuffer, so how is the minifilter going to detect that this sort of thing has happened and how can it access that buffer? This is where FltGetNewSystemBufferAddress() comes in handy. A minifilter can detect that this has happened by checking whether the FLTFL_CALLBACK_DATA_NEW_SYSTEM_BUFFER flag is set, and if it is, it can call the API to get a pointer to that buffer.
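
The decision a postOp callback has to make can be sketched with mock types. Everything here is a stand-in: `mock_callback_data`, `mock_get_new_system_buffer` and `post_op_pick_buffer` are hypothetical names, the flag value is inferred from the "+100000" in the debugger output in this post rather than taken from fltKernel.h, and returning NULL when the flag is clear is a choice made for the mock. A real minifilter would use FLTFL_CALLBACK_DATA_NEW_SYSTEM_BUFFER and FltGetNewSystemBufferAddress() on a real FLT_CALLBACK_DATA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value, inferred from the "+100000" printed by !fltkd.cbd. */
#define MOCK_NEW_SYSTEM_BUFFER 0x00100000u

/* Hypothetical stand-in for the pieces of FLT_CALLBACK_DATA used here. */
typedef struct {
    uint32_t flags;             /* plays the role of Data->Flags */
    void    *directory_buffer;  /* what the preOp snapshot shows */
    void    *new_system_buffer; /* what the file system attached at completion */
} mock_callback_data;

/* Stand-in for FltGetNewSystemBufferAddress(); the mock returns NULL when
 * the flag is not set. */
void *mock_get_new_system_buffer(const mock_callback_data *data)
{
    assert(data != NULL);
    return (data->flags & MOCK_NEW_SYSTEM_BUFFER) ? data->new_system_buffer
                                                  : NULL;
}

/* Pick the buffer that holds the results in postOp. */
void *post_op_pick_buffer(const mock_callback_data *data)
{
    assert(data != NULL);
    if (data->flags & MOCK_NEW_SYSTEM_BUFFER) {
        /* The file system swapped in a system buffer at completion time. */
        return mock_get_new_system_buffer(data);
    }
    /* Otherwise only the snapshotted buffer is available. In a real filter
     * this may be a user mode address that cannot simply be dereferenced. */
    return data->directory_buffer;
}
```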

Let's take a look at this in the debugger (this is easy to repro: in the PassThrough sample, set a breakpoint in PtPostOperationPassThrough that triggers when FLTFL_CALLBACK_DATA_NEW_SYSTEM_BUFFER is set for an IRP_MJ_DIRECTORY_CONTROL function with an IRP_MN_NOTIFY_CHANGE_DIRECTORY minor function). Also remember that this is Win7-specific, so you won't see your breakpoint trigger on anything before that:

1: kd> !fltkd.cbd @@(Data)

IRP_CTRL: 924482c8  DIRECTORY_CONTROL (12) [00180001] Irp PostOperation
Flags                    : [10000004] DontCopyParms FixedAlloc
Irp                      : 9244da98 
DeviceObject             : 92fa1c68 "\Device\HarddiskVolume2"
FileObject               : 92f45570 
CompletionNodeStack      : 92448380   Size=5  Next=1
SyncEvent                : (924482d8)
InitiatingInstance       : 00000000 
SwappedBufferMdl         : 00000000 
CallbackData             : (92448328)
 Flags                    : [00180001] Irp PostOperation +100000!!
 Thread                   : 943d5560 
 Iopb                     : 92448398 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 00000012 
 TagData                  : 00000000 
 FilterContext[0]         : 00000000 
 FilterContext[1]         : 00000000 
 FilterContext[2]         : 00000000 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   924484a0
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   92448458
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   92448410
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [12,2]   00060000  00   0000   930a8978 92f45570 9605c6c8-00000000   924483c8
            ("FileInfo","FileInfo")  fileinfo!FIPostOperationCommonCallback 
     Args: 00000800 00000015 00000000 00000000 087a7830 0000000000000000
>[12,2]   00060000  00   0000   923cab40 92f45570 a0f51140-00000000   92448380
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough 
     Args: 00000800 00000015 00000000 00000000 087a7830 0000000000000000
Working IOPB:
[12,2]   00060000  00          930a8978 92f45570                     92448354
            ("FileInfo","FileInfo")  
     Args: 00000800 00000015 00000000 00000000 087a7830 0000000000000000
1: kd> dt 9244da98 nt!_IRP AssociatedIrp.SystemBuffer
   +0x00c AssociatedIrp              : 
      +0x000 SystemBuffer               : 0xb1f87800 Void
1: kd> dt 92448380 fltmgr!_COMPLETION_NODE DataSnapshot.Parameters.DirectoryControl.NotifyDirectory.
   +0x018 DataSnapshot                                              : 
      +0x010 Parameters                                                : 
         +0x000 DirectoryControl                                          : 
            +0x000 NotifyDirectory                                           : 
               +0x000 Length                                                    : 0x800
               +0x004 CompletionFilter                                          : 0x15
               +0x008 Spare1                                                    : 0
               +0x00c Spare2                                                    : 0
               +0x010 DirectoryBuffer                                           : 0x087a7830 Void
               +0x014 MdlAddress                                                : (null) 
So as you can see by looking at this callback data, the DirectoryBuffer is a user mode address and there is no MDL, so the minifilter has no way to get to the actual buffer being returned by the file system. Before Win7, a minifilter that needed to look at the buffer in the postOp had to either allocate an MDL for the buffer in the preOp or replace the buffer with its own. Another thing to note is that the fltkd extension doesn't know about FLTFL_CALLBACK_DATA_NEW_SYSTEM_BUFFER and so it displays it by value (it is the "+100000!!" at the end of the CallbackData Flags line).
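
To make the flag values in the output easier to read, here is a small decoder sketch. The bit values are inferred from the !fltkd.cbd output in these posts (0x00000009 prints as "Irp SystemBuffer", 0x00080009 as "Irp PostOperation SystemBuffer", and 0x00180001 leaves 0x100000 unnamed), so treat them as assumptions and check fltKernel.h for the authoritative definitions. `decode_flags` and the "NewSystemBuffer" label are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint32_t    bit;
    const char *name;
} flag_name;

/* Values inferred from the debugger output, not copied from a header. */
static const flag_name known_flags[] = {
    { 0x00000001u, "Irp" },
    { 0x00080000u, "PostOperation" },
    { 0x00000008u, "SystemBuffer" },
    { 0x00100000u, "NewSystemBuffer" },
};

/* Writes the names of the set flags into out, followed by "+<hex>" for any
 * bits without a known name. Stops early if out is too small.
 * Returns the number of characters written. */
size_t decode_flags(uint32_t flags, char *out, size_t out_size)
{
    size_t used = 0;
    size_t i;

    assert(out != NULL && out_size > 0);
    out[0] = '\0';

    for (i = 0; i < sizeof(known_flags) / sizeof(known_flags[0]); i++) {
        if (flags & known_flags[i].bit) {
            int n = snprintf(out + used, out_size - used, "%s%s",
                             used ? " " : "", known_flags[i].name);
            if (n < 0 || (size_t)n >= out_size - used) {
                out[used] = '\0';        /* drop the truncated piece */
                return used;
            }
            used += (size_t)n;
            flags &= ~known_flags[i].bit;
        }
    }

    if (flags != 0) {
        int n = snprintf(out + used, out_size - used, "%s+%x",
                         used ? " " : "", (unsigned)flags);
        if (n < 0 || (size_t)n >= out_size - used) {
            out[used] = '\0';
            return used;
        }
        used += (size_t)n;
    }
    return used;
}
```

With these values, 0x00180001 decodes to the three names "Irp PostOperation NewSystemBuffer" instead of two names and a raw number.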

Thursday, April 21, 2011

Filter Manager Concepts: Part 8 - COMPLETION_NODE

I'd like to talk a bit about how FltMgr handles IO completion and how FltMgr can guarantee that a minifilter will get the same parameters in the postOp callback as it did in the preOp.

It's probably pretty clear that FltMgr design shares a lot with IO manager design. In fact, in my opinion, it is a pretty transparent layer. It does not create many abstractions that do not map to IO manager concepts. This is pretty clearly visible in the design of the IO path. One of the FltMgr's main goals was to make better use of kernel stack space and a way to achieve this was to use a callback model as opposed to the IO manager's call-through model, which changes how IO is processed. However, because FltMgr is designed to be as asynchronous as the IO manager, it also makes use of a similar concept to an IRP, which is the FLT_CALLBACK_DATA structure.

Another big difference from the IO manager model is that minifilters can load at any time and at any point in the IO stack. This is pretty significant because when the IO manager allocates an IRP it knows which device it will be sent to and it knows the depth of the IO stack, so it knows exactly how many IO_STACK_LOCATIONs that IRP might need. Any new devices will be attached on top of that device and they will not see the IRP. For FltMgr to implement the same model it would need to take a snapshot of the minifilters on the stack at the time each IO is allocated and then only show that IO to those minifilters, skipping any minifilters that were added in between. But what if a minifilter allocates a FLT_CALLBACK_DATA structure to use for issuing IO under certain circumstances? It is possible that the FLT_CALLBACK_DATA is quite long-lived, and whenever it is used some minifilters would be bypassed. So FltMgr was designed in a different way to account for this. Every time a minifilter is done processing a request it returns control to FltMgr, which determines the next node to call at that time. This means that once a minifilter is loaded it might see IO that was initiated before the minifilter was present. It also means that FltMgr cannot know in advance the number of minifilters that will see any particular operation. So while it does keep a stack of structures that is similar to the IO_STACK_LOCATION array, the array cannot be fixed in size. Instead, FltMgr can reallocate the array on the fly if it discovers that it is about to run out of stack.
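
The "fixed array that can be replaced on the fly" idea can be sketched in user mode as follows. This is a model only: `node_stack`, `stack_push` and friends are hypothetical names, the inline capacity of 5 and the doubling policy are arbitrary choices for the sketch, and nothing here reflects how FltMgr actually sizes or reallocates its array.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define INLINE_NODES 5   /* arbitrary inline capacity for the sketch */

typedef struct {
    int filter_id;       /* stands in for the per-minifilter completion data */
} node;

typedef struct {
    node   inline_nodes[INLINE_NODES]; /* allocated together with the header */
    node  *nodes;                      /* points at inline_nodes until growth */
    size_t capacity;
    size_t count;
} node_stack;

void stack_init(node_stack *s)
{
    assert(s != NULL);
    s->nodes = s->inline_nodes;
    s->capacity = INLINE_NODES;
    s->count = 0;
}

/* Pushes one node, moving to a larger heap array if the current one is full.
 * Returns 0 on success, -1 if the allocation fails. */
int stack_push(node_stack *s, int filter_id)
{
    assert(s != NULL);
    if (s->count == s->capacity) {
        size_t new_capacity = s->capacity * 2;
        node *bigger = malloc(new_capacity * sizeof(node));
        if (bigger == NULL) {
            return -1;
        }
        memcpy(bigger, s->nodes, s->count * sizeof(node));
        if (s->nodes != s->inline_nodes) {
            free(s->nodes);            /* only heap arrays are freed */
        }
        s->nodes = bigger;
        s->capacity = new_capacity;
    }
    s->nodes[s->count++].filter_id = filter_id;
    return 0;
}

void stack_destroy(node_stack *s)
{
    assert(s != NULL);
    if (s->nodes != s->inline_nodes) {
        free(s->nodes);
    }
    stack_init(s);
}
```

The common case (few minifilters) never allocates, and a late-loading minifilter only costs an allocation for the operations that actually overflow the inline array.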

The COMPLETION_NODE structure is similar to the IO_STACK_LOCATION and it serves a similar purpose. It is used to know which minifilters to call during IO completion. By default there is an array of such structures that is allocated after the main IRP_CTRL structure header but as I explained above, if FltMgr runs out of entries while processing an operation, it can allocate a new array on the fly and use that one. So let's take a look in the debugger (broken in a preCreate callback for a slightly modified passthrough sample). I've tried to use different colors since I'm a guy and I hope it won't be too difficult to track which fields are related:

1: kd> kn
 # ChildEBP RetAddr  
00 9a7b98d4 96029aeb PassThrough!PtPreOperationPassThrough+0x5b [c:\temp1\passthrough\passthrough.c @ 672]
01 9a7b9940 9602c9f0 fltmgr!FltpPerformPreCallbacks+0x34d
02 9a7b9958 960401fe fltmgr!FltpPassThroughInternal+0x40
03 9a7b996c 960408b7 fltmgr!FltpCreateInternal+0x24
04 9a7b99b0 828744bc fltmgr!FltpCreate+0x2c9
05 9a7b99c8 82a786ad nt!IofCallDriver+0x63
06 9a7b9aa0 82a5926b nt!IopParseDevice+0xed7
07 9a7b9b1c 82a7f2d9 nt!ObpLookupObjectName+0x4fa
08 9a7b9b78 82a9acfa nt!ObOpenObjectByName+0x165
09 9a7b9d24 8287b44a nt!NtQueryAttributesFile+0x121
0a 9a7b9d24 774764f4 nt!KiFastCallEntry+0x12a
1: kd> !fltkd.cbd 0x9239ee40

IRP_CTRL: 9239ede0  CREATE (0) [00000009] Irp SystemBuffer
Flags                    : [1000000c] DontCopyParms Synchronize FixedAlloc
Irp                      : 94452248 
DeviceObject             : 92fa1c68 "\Device\HarddiskVolume2"
FileObject               : 92f93558 
CompletionNodeStack      : 9239ee98   Size=5  Next=1
SyncEvent                : (9239edf0)
InitiatingInstance       : 00000000 
Icc                      : 9a7b9984 
CreateIrp.NameCacheCtrl  : 9410ce18 
CreateIrp.SavedFsContext : 00000000 
CallbackData             : (9239ee40)
 Flags                    : [00000009] Irp SystemBuffer
 Thread                   : 942b6d48 
 Iopb                     : 9239ee6c 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 00000000 
 TagData                  : 00000000 
 FilterContext[0]         : 00000000 
 FilterContext[1]         : 00000000 
 FilterContext[2]         : 00000000 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239efb8
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239ef70
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239ef28
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239eee0
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000884  00   0000   923a5678 92f93558 a0ede180-00000000   9239ee98
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough 
     Args: 9a7b99ec 01200000 00070000 00000000 00000000 0000000000000000
Working IOPB:
>[0,0]    00000884  00          923a5678 92f93558                     9239ee6c
            ("PassThrough","PassThrough Instance")  
     Args: 9a7b99ec 01200000 00070000 00000000 00000000 0000000000000000

1: kd> dt 9239ede0 fltmgr!_IRP_CTRL
   +0x000 Type             : _FLT_TYPE
   +0x004 Flags            : 0x1000000c (No matching name)
   +0x008 MajorFunction    : 0 ''
   +0x009 Reserved0        : 0 ''
   +0x00a CompletionStackLength : 0x5 ''
   +0x00b NextCompletion   : 0x1 ''
   +0x00c CompletionStack  : 0x9239ee98 _COMPLETION_NODE
   +0x010 SyncEvent        : _KEVENT
   +0x020 Irp              : 0x94452248 _IRP
   +0x020 FsFilterData     : 0x94452248 _FS_FILTER_CALLBACK_DATA
   +0x024 AsyncCompletionRoutine : (null) 
   +0x028 AsyncCompletionContext : (null) 
   +0x02c InitiatingInstance : (null) 
   +0x030 PendingCallbackNode : 0x923a578c _CALLBACK_NODE
   +0x030 StartingCallbackNode : 0x923a578c _CALLBACK_NODE
   +0x034 preOp            : 
   +0x034 postOp           : 
   +0x038 PostCompletionRoutine : 0x96043f3c     void  fltmgr!FltDeletePushLock+0
   +0x03c DeviceObject     : 0x92fa1c68 _DEVICE_OBJECT
   +0x040 FileObject       : 0x92f93558 _FILE_OBJECT
   +0x044 FltWork          : _FLTP_WORKITEM
   +0x044 PendingCallbackContext : (null) 
   +0x048 CachedCompletionNode : (null) 
   +0x04c PendingStatus    : 0n0
   +0x058 CreateIrp        : 
   +0x058 CloseIrp         : 
   +0x060 Data             : _FLT_CALLBACK_DATA
   +0x08c WorkingParameters : _FLT_IO_PARAMETER_BLOCK

1: kd> ?? Data
struct _FLT_CALLBACK_DATA * 0x9239ee40
   +0x000 Flags            : 9
   +0x004 Thread           : 0x942b6d48 _KTHREAD
   +0x008 Iopb             : 0x9239ee6c _FLT_IO_PARAMETER_BLOCK
   +0x00c IoStatus         : _IO_STATUS_BLOCK
   +0x014 TagData          : (null) 
   +0x018 QueueLinks       : _LIST_ENTRY [ 0x0 - 0x0 ]
   +0x020 QueueContext     : [2] (null) 
   +0x018 FilterContext    : [4] (null) 
   +0x028 RequestorMode    : 1 ''

1: kd> dt 9239ee98 fltmgr!_COMPLETION_NODE
   +0x000 IrpCtrl          : 0x9239ede0 _IRP_CTRL
   +0x004 CallbackNode     : 0x923a578c _CALLBACK_NODE
   +0x004 Filter           : 0x923a578c _FLT_FILTER
   +0x008 InstanceLink     : _LIST_ENTRY [ 0x0 - 0x0 ]
   +0x010 InstanceTrackingList : (null) 
   +0x014 Context          : (null) 
   +0x018 DataSnapshot     : _FLT_IO_PARAMETER_BLOCK
   +0x044 Flags            : 0

1: kd> ?? sizeof(fltmgr!_IRP_CTRL) + 0x9239ede0  
unsigned int 0x9239ee98

1: kd> ??0x9239ede0+0x8c
unsigned int 0x9239ee6c

I've highlighted a couple of interesting things:
  • CompletionNodeStack points to exactly the end of the IRP_CTRL structure
  • the Data parameter for the preOp callback is actually a part of the IRP_CTRL structure
  • Data->Iopb is actually pointing to the WorkingParameters structure of the IRP_CTRL structure, just as !fltkd.cbd shows for "Working IOPB"
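
The address relationships above can be double-checked with a few lines of arithmetic. The offsets below are read from the dt output for this particular 32-bit Win7 build (Data at +0x60, WorkingParameters at +0x8c, DataSnapshot at +0x18); the structure size 0xb8 and the node size 0x48 are derived from the addresses in the output (0x9239ee98 - 0x9239ede0 and the spacing between "Node Adr" values). They are assumptions for any other build, and the helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets and sizes taken from the debugger output; build-specific. */
enum {
    IRP_CTRL_DATA_OFFSET               = 0x60,
    IRP_CTRL_WORKING_PARAMETERS_OFFSET = 0x8c,
    IRP_CTRL_SIZE                      = 0xb8,
    COMPLETION_NODE_SIZE               = 0x48,
    COMPLETION_NODE_SNAPSHOT_OFFSET    = 0x18,
    COMPLETION_STACK_LENGTH            = 5
};

/* Address of the FLT_CALLBACK_DATA embedded in the IRP_CTRL. */
uint32_t callback_data_address(uint32_t irp_ctrl)
{
    return irp_ctrl + IRP_CTRL_DATA_OFFSET;
}

/* Address Data->Iopb holds during preOp (WorkingParameters). */
uint32_t working_iopb_address(uint32_t irp_ctrl)
{
    return irp_ctrl + IRP_CTRL_WORKING_PARAMETERS_OFFSET;
}

/* Address of the index-th COMPLETION_NODE in the default array, which
 * starts right after the IRP_CTRL structure. */
uint32_t completion_node_address(uint32_t irp_ctrl, unsigned index)
{
    assert(index < COMPLETION_STACK_LENGTH);
    return irp_ctrl + IRP_CTRL_SIZE + index * COMPLETION_NODE_SIZE;
}

/* Address of the DataSnapshot member inside a COMPLETION_NODE. */
uint32_t data_snapshot_address(uint32_t completion_node)
{
    return completion_node + COMPLETION_NODE_SNAPSHOT_OFFSET;
}
```

For the IRP_CTRL at 0x9239ede0 these give 0x9239ee40 for Data, 0x9239ee6c for the working IOPB and 0x9239ee98 for the first COMPLETION_NODE, matching the values in the output.
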
Now let's take a look at the same IRP_CTRL in the postCreate callback and see what's different. I've made one modification to the PassThrough sample: I'm actually sending something from preCreate to postCreate using the Context parameter.
1: kd> !fltkd.cbd @@(Data)

IRP_CTRL: 9239ede0  CREATE (0) [00080009] Irp PostOperation SystemBuffer
Flags                    : [1000000c] DontCopyParms Synchronize FixedAlloc
Irp                      : 94452248 
DeviceObject             : 92fa1c68 "\Device\HarddiskVolume2"
FileObject               : 92f93558 
CompletionNodeStack      : 9239ee98   Size=5  Next=1
SyncEvent                : (9239edf0)
InitiatingInstance       : 00000000 
SwappedBufferMdl         : 9a7b9984 
StartingCallbackNode     : ffffffff 
CreateIrp.NameCacheCtrl  : 9410ce18 
CreateIrp.SavedFsContext : 00000000 
CallbackData             : (9239ee40)
 Flags                    : [00080009] Irp PostOperation SystemBuffer
 Thread                   : 942b6d48 
 Iopb                     : 9239eeb0 
 RequestorMode            : [01] UserMode
 IoStatus.Status          : 0x00000000 
 IoStatus.Information     : 00000001 
 TagData                  : 00000000 
 FilterContext[0]         : 00000000 
 FilterContext[1]         : 00000000 
 FilterContext[2]         : 00000000 
 FilterContext[3]         : 00000000 

   Cmd     IrpFl   OpFl  CmpFl  Instance FileObjt Completion-Context  Node Adr
--------- -------- ----- -----  -------- -------- ------------------  --------
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239efb8
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239ef70
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000000  00   0000   00000000 00000000 00000000-00000000   9239ef28
     Args: 00000000 00000000 00000000 00000000 00000000 0000000000000000
 [0,0]    00000884  00   0000   930a8978 92f93558 96062d6c-49c0c357   9239eee0
            ("FileInfo","FileInfo")  fileinfo!FIPostCreateCallback 
     Args: 9a7b99ec 01200000 00070000 00000000 00000000 0000000000000000
>[0,0]    00000884  00   0000   923a5678 92f93558 a0ede180-a4e70d1c   9239ee98
            ("PassThrough","PassThrough Instance")  PassThrough!PtPostOperationPassThrough 
     Args: 9a7b99ec 01200000 00070000 00000000 00000000 0000000000000000
Working IOPB:
[0,0]    00000884  00          930a8978 92f93558                     9239ee6c
            ("FileInfo","FileInfo")  
     Args: 9a7b99ec 01200000 00070000 00000000 00000000 0000000000000000
 
1: kd> ?? Data
struct _FLT_CALLBACK_DATA * 0x9239ee40
   +0x000 Flags            : 0x80009
   +0x004 Thread           : 0x942b6d48 _KTHREAD
   +0x008 Iopb             : 0x9239eeb0 _FLT_IO_PARAMETER_BLOCK
   +0x00c IoStatus         : _IO_STATUS_BLOCK
   +0x014 TagData          : (null) 
   +0x018 QueueLinks       : _LIST_ENTRY [ 0x0 - 0x0 ]
   +0x020 QueueContext     : [2] (null) 
   +0x018 FilterContext    : [4] (null) 
   +0x028 RequestorMode    : 1 ''

1: kd> dt 9239ee98 fltmgr!_COMPLETION_NODE
   +0x000 IrpCtrl          : 0x9239ede0 _IRP_CTRL
   +0x004 CallbackNode     : 0x923a578c _CALLBACK_NODE
   +0x004 Filter           : 0x923a578c _FLT_FILTER
   +0x008 InstanceLink     : _LIST_ENTRY [ 0x944ffa04 - 0x9449eea0 ]
   +0x010 InstanceTrackingList : 0x944ffa00 _COMPLETION_NODE_TRACKING_LIST
   +0x014 Context          : 0xa4e70d1c Void
   +0x018 DataSnapshot     : _FLT_IO_PARAMETER_BLOCK
   +0x044 Flags            : 0

1: kd> ?? 0x9239ee98+0x18
unsigned int 0x9239eeb0
So now we can see how things changed from pre to post:
  • The most important change is that while Data is at the same location (since it is allocated in the IRP_CTRL), Data->Iopb now points to the DataSnapshot member of the COMPLETION_NODE structure.
  • !fltkd.cbd still shows WorkingIOPB as being 9239ee6c when in fact it should be 0x9239eeb0. So don't trust the WorkingIOPB in a postOp callback. However, the Iopb member under CallbackData is actually correct, pointing to the right Iopb.
  • WorkingIOPB looks like the IOPB for the COMPLETION_NODE for the fileinfo minifilter, the lowest one in this frame.
So let's go over how the whole thing works:

PreOP
  1. before calling a preOp callback fltmgr looks at whether the minifilter has a postOp callback registered and if so it initializes a COMPLETION_NODE structure on the stack and copies WorkingParameters into its IOPB member (DataSnapshot).
  2. IRP_CTRL->Data->Iopb is set to point to IRP_CTRL->WorkingParameters (this is the same address for all preOp callbacks)
  3. preOp callback is called
  4. if preOp callback doesn't want a postOp callback, then the COMPLETION_NODE on the stack is "released"
  5. any changes the minifilter made to the IOPB were made directly into WorkingParameters, from where they might be copied into the COMPLETION_NODE for the next minifilter
  6. if the minifilter wanted a postOp callback, FltMgr will add the COMPLETION_NODE to a list of COMPLETION_NODEs to be drained if the minifilter unloads.

PostOP
  1. before calling the postOp callback fltmgr changes the IRP_CTRL->Data->Iopb to point to the COMPLETION_NODE->DataSnapshot
  2. postOp is called
  3. after postOp returns the CompletionNodeStack is changed to point to the next (or previous depending on your perspective; the node for the first minifilter with a higher altitude that wanted a postOp callback) COMPLETION_NODE to be called.
Finally, there are a couple of things I'd like to mention because I've seen some minifilters doing funky stuff in the postOp callback:
  • FltMgr never looks at the IOPB to see if the filter changed something. It also doesn't look at the FLTFL_CALLBACK_DATA_DIRTY flag. The parameters are always saved before the preOp callback is called if the minifilter has a postOp callback. The next minifilter down gets exactly the same IOPB location (in the preOp path). This is not exactly the same as the IO_STACK_LOCATION mechanism.
  • In the postOp callback there is no point in changing the IOPB; it will be discarded and any change will be ignored. This also means that even if a minifilter does change the IOPB in a postOp, it doesn't need to call FltSetCallbackDataDirty().
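
The PreOP and PostOP steps above can be modeled in a few lines of user mode C. This is a simplified sketch, not FltMgr code: `iopb`, `irp_ctrl`, `call_pre_op` and `call_post_op` are hypothetical names, the IOPB is reduced to a single field, the stack is a fixed array, and every filter in the model is assumed to have a postOp callback registered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 5

typedef struct { uint32_t length; } iopb;           /* one-field IOPB */

typedef struct {
    int  filter_id;
    iopb snapshot;                                  /* plays DataSnapshot */
} completion_node;

typedef struct {
    iopb            working;                        /* plays WorkingParameters */
    iopb           *current;                        /* plays Data->Iopb */
    completion_node stack[MAX_NODES];
    size_t          depth;                          /* nodes waiting for postOp */
} irp_ctrl;

/* A preOp callback returns nonzero if it wants a postOp callback. */
typedef int (*pre_op_fn)(iopb *params);

int double_length_pre(iopb *p) { p->length *= 2; return 1; }
int observe_pre(iopb *p)       { (void)p; return 1; }
int no_post_pre(iopb *p)       { (void)p; return 0; }

void call_pre_op(irp_ctrl *c, int filter_id, pre_op_fn pre)
{
    completion_node *n;

    assert(c != NULL && c->depth < MAX_NODES);
    n = &c->stack[c->depth];
    n->filter_id = filter_id;
    n->snapshot = c->working;      /* saved before the callback runs */
    c->current = &c->working;      /* every preOp sees the same location */
    if (pre(c->current)) {
        c->depth++;                /* keep the node for the postOp */
    }
    /* otherwise the node is "released" by not advancing depth */
}

/* Returns the parameters shown to the next postOp callback and its filter
 * id, or NULL when no completion nodes remain. */
const iopb *call_post_op(irp_ctrl *c, int *filter_id)
{
    completion_node *n;

    assert(c != NULL && filter_id != NULL);
    if (c->depth == 0) {
        return NULL;
    }
    n = &c->stack[--c->depth];
    c->current = &n->snapshot;     /* Data->Iopb now points at the snapshot */
    *filter_id = n->filter_id;
    return c->current;
}
```

In this model a filter that changes the parameters in its preOp still sees its own original values in its postOp, while the filters below it see the changed values in both callbacks.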

Finally, I'd like to warn you that all this is undocumented and there are no guarantees that FltMgr will still behave the same way in the future. I'm only showing how this works to help with debugging (and also because I just spent a couple of hours trying to figure out why !fltkd.cbd doesn't give me the right WorkingIOPB :( ).

Thursday, April 14, 2011

An Extension for Tracking Stack Usage in Drivers

The kernel stack is a very limited resource and drivers need to be very careful about how they use that stack. File system filters in particular need to be very frugal, since they are in the IO path and any IO requests they make will use whatever stack remains. Then there is the issue of reentrancy which might not be such a serious issue for minifilters, but it still matters (there are still some cases where name provider callbacks are involved where stack can disappear quite quickly). Also, with the number of minifilters on the rise it is clear that stack usage is something file system filter developers must be extra careful about.

So as I was looking into reducing stack usage for one of my projects (which I knew had some issues with stack usage) I realized that I didn't have a good way to know which functions to look at. So I finally started on a project I've been meaning to work on for a while but never had the time. The result of that is a debugger extension that looks at a set of functions (matching a pattern) and displays the size of the local variables for each function. Here are some examples:

0: kd> !ace.stack_usage passthrough!*
96 bytes used for locals by func: a5dee450 PassThrough!PtNormalizeNameComponentExCallback (...)
48 bytes used for locals by func: a5dee250 PassThrough!PtGenerateFileNameCallback (...)
20 bytes used for locals by func: a5dee620 PassThrough!_except_handler4 (...)
16 bytes used for locals by func: a5dee020 PassThrough!PtPreOperationPassThrough (...)
12 bytes used for locals by func: a5df2010 PassThrough!DriverEntry (...)
8 bytes used for locals by func: a5dee0d0 PassThrough!PtOperationStatusCallback (...)
8 bytes used for locals by func: a5dee1d0 PassThrough!PtDoRequestOperationStatus (...)
4 bytes used for locals by func: a5df1010 PassThrough!PtInstanceSetup (...)
4 bytes used for locals by func: a5df1150 PassThrough!PtInstanceTeardownComplete (...)
4 bytes used for locals by func: a5df10f0 PassThrough!PtInstanceTeardownStart (...)

Probably the biggest limitation is that it requires an x86 target machine which isn't a problem at the moment since most drivers have an x86 version anyway. I'm working on doing this for amd64 but it requires a pretty different implementation. One interesting aspect of it is that it also works with public symbols and so you can actually see stack usage for the top fltmgr functions for example:

0: kd> !ace.stack_usage fltmgr!* 20
524 bytes used for locals by func: 9626cda0 fltmgr!FltpvPrintErrors = 
404 bytes used for locals by func: 9625fff0 fltmgr!FltpCalculateLegacyFilterAltitude = 
384 bytes used for locals by func: 96262be2 fltmgr!FltpDoUnloadFilter = 
356 bytes used for locals by func: 96260ca0 fltmgr!FltpOpenInstancesRegistryKey = 
332 bytes used for locals by func: 96262ddc fltmgr!FltLoadFilter = 
300 bytes used for locals by func: 962672c0 fltmgr!FltGetDestinationFileNameInformation = 
292 bytes used for locals by func: 9624a9d0 fltmgr!DrainCompletionNode = 
280 bytes used for locals by func: 9625f558 fltmgr!FltpFsNotificationActual = 
184 bytes used for locals by func: 9625f3a6 fltmgr!FltpLogEvent = 
148 bytes used for locals by func: 9625c4b4 fltmgr!FltpGetVolumeFromName = 
144 bytes used for locals by func: 9624dc20 fltmgr!FltSendMessage = 
124 bytes used for locals by func: 96273506 fltmgr!FltpInitializeMessagingSupport = 
112 bytes used for locals by func: 96273878 fltmgr!FltpBuildParameterOffsetTable = 
100 bytes used for locals by func: 9626295a fltmgr!FltpOpenLinkOrRenameTarget = 
100 bytes used for locals by func: 9627316e fltmgr!DriverEntry = 
96 bytes used for locals by func: 96263694 fltmgr!FltpOpenClientPort = 
96 bytes used for locals by func: 9625d174 fltmgr!FltpEnumerateAggregateFilterInformation = 
92 bytes used for locals by func: 9626a522 fltmgr!FltGetVolumeGuidName = 
88 bytes used for locals by func: 9624d1fa fltmgr!FltReadFile = 
88 bytes used for locals by func: 9626d5a6 fltmgr!FltvPreOperation = 
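
The post doesn't say how the extension computes these numbers. One plausible way to get a similar figure on x86 without private symbols is to look at the function prologue and read the immediate of the "sub esp, N" instruction. The sketch below does only that; it is an assumption about a possible approach, not a description of the ACE extension. `prologue_locals_size` is a hypothetical name, and the sketch ignores frame-pointer-omitted functions, helper calls such as stack probes and SEH prologue helpers, and anything else a real tool would have to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the number of bytes reserved by "sub esp, N" in a standard
 * EBP-frame prologue, 0 if the frame is set up without reserving space,
 * or -1 if the prologue is not recognized. */
int64_t prologue_locals_size(const uint8_t *code, size_t len)
{
    size_t i = 0;

    assert(code != NULL);

    /* optional hot-patch padding: mov edi, edi (8B FF) */
    if (len - i >= 2 && code[i] == 0x8B && code[i + 1] == 0xFF) {
        i += 2;
    }
    /* push ebp (55) */
    if (len - i < 1 || code[i] != 0x55) {
        return -1;
    }
    i += 1;
    /* mov ebp, esp (8B EC or 89 E5) */
    if (len - i < 2) {
        return -1;
    }
    if (!((code[i] == 0x8B && code[i + 1] == 0xEC) ||
          (code[i] == 0x89 && code[i + 1] == 0xE5))) {
        return -1;
    }
    i += 2;
    /* sub esp, imm8 (83 EC ib); the immediate is sign-extended, so only
     * values up to 0x7F are accepted here */
    if (len - i >= 3 && code[i] == 0x83 && code[i + 1] == 0xEC) {
        return code[i + 2] <= 0x7F ? (int64_t)code[i + 2] : -1;
    }
    /* sub esp, imm32 (81 EC id), little endian immediate */
    if (len - i >= 6 && code[i] == 0x81 && code[i + 1] == 0xEC) {
        uint32_t imm = (uint32_t)code[i + 2] |
                       ((uint32_t)code[i + 3] << 8) |
                       ((uint32_t)code[i + 4] << 16) |
                       ((uint32_t)code[i + 5] << 24);
        return (int64_t)imm;
    }
    return 0;
}
```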

You can download the file at https://sites.google.com/site/fsfilters/downloads/ACE1.0.zip?attredirects=0&d=1 and give it a go. I hope you'll find it useful, I've had a couple of surprises when running it on my drivers :)...

Thursday, April 7, 2011

Names in Minifilters - The Flags of FltGetFileNameInformation

While we're on the topic of names I'd like to address a question that comes up pretty regularly: what do the various flags of FltGetFileNameInformation mean and why do they exist in the first place? The actual documentation can be found in the FltGetFileNameInformation MSDN page. Flags can of course be added and their meaning might change in some ways, so always check that page first.
There is a good discussion of each type of name on the MSDN page for the FLT_FILE_NAME_INFORMATION structure.

FLT_FILE_NAME_SHORT


The description of a short name in the documentation is pretty good. It is important to note that this is just the final component, not a full path. There is only one short name for a file. Hardlinks can only be long names. Also please note that any name might have a ~ in it and it might be 8.3 compatible but that doesn't mean it is a short name. In fact, it is possible that a file is created that has a long name like "foo~1.txt" and a short name "foobar.txt". Going even further, the short name can be set independently and so it doesn't even have to resemble the long name at all. So please don't make assumptions about what long and short names look like. The only thing you can assume is that if a name is longer than 8.3 then it is a long name. If it fits in 8.3 then it's impossible to tell just by inspecting it whether it's a long or short name. Other than calling this API (which is the easiest way anyway), one can get the short name for a file from the parent directory (see the documentation for ZwQueryDirectoryFile and some of the information classes like FileBothDirectoryInformation) or from the file directly (see ZwQueryInformationFile and the FileAlternateNameInformation information class). Also, please note that not all files must have a short name. It is perfectly normal that there is no short name at all.
This is not a very popular name request. I don't think I've ever used it or seen a minifilter that uses it.
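
The only inference that is safe to make from the text of a name can be written down as a length check. This is a minimal sketch: `fits_8dot3_lengths` is a hypothetical helper, it works on narrow C strings rather than UNICODE_STRING, and it checks lengths only (a real 8.3 validity check has additional character rules that are ignored here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns false when the name is certainly a long name (it does not fit the
 * 8.3 length limits). Returns true when the name fits 8.3, in which case it
 * may be either a long name or a short name; inspection cannot tell. */
bool fits_8dot3_lengths(const char *name)
{
    const char *dot;
    size_t base_len;
    size_t ext_len;

    assert(name != NULL);

    dot = strchr(name, '.');
    if (dot == NULL) {
        base_len = strlen(name);
        ext_len = 0;
    } else {
        if (strchr(dot + 1, '.') != NULL) {
            return false;              /* more than one dot */
        }
        base_len = (size_t)(dot - name);
        ext_len = strlen(dot + 1);
    }
    return base_len >= 1 && base_len <= 8 && ext_len <= 3;
}
```

Both "foo~1.txt" and "foobar.txt" pass the check, which is exactly why the check says nothing about which of the two is the short name.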

FLT_FILE_NAME_OPENED


This returns a path to the file. The name might contain short names for components in the path. In case there are multiple hardlinks and the file was opened by name, this name will be the appropriate one, using the appropriate directory entry and link (in other words if C:\foo\test1.txt and C:\bar\test2.txt are hardlinks to the same file and a FILE_OBJECT is opened for "C:\foo\test1.txt" then the opened name will never return "C:\bar\test2.txt" on that FILE_OBJECT). This is a very important point to remember for any file system or file system filter that implements something that looks like hardlinks (multiple names for the same file). For a more in-depth discussion on when opened names are useful see my post on using names in filters.

FLT_FILE_NAME_NORMALIZED


The normalized name is described quite well on the MSDN page I was talking about. One thing that is not mentioned in that page is that it also takes into account the hardlink the file was opened on (like the opened name), so two FILE_OBJECTs that are opened on the same underlying file (SCB) but through different hardlinks will have different normalized names. It is very useful anywhere the file name influences the behavior of a filter (for example encrypting some files depending on file name or path; any name-based policy in general). Again, see my post on using names in filters.

FLT_FILE_NAME_QUERY_FILESYSTEM_ONLY


This tells FltMgr (and any name provider minifilter) to re-generate the name by querying the file system. It is meant to prevent any name cache associated with the file or any path component from being used. Please note that this can fail because it is not always safe to query the name from the file system (see where STATUS_FLT_INVALID_NAME_REQUEST is returned). The important thing to note is that if it's not safe to query the name it doesn't mean that the requests FltMgr issues will fail but rather that the system might bugcheck or deadlock. So FltMgr has a set of checks it performs to see if it is safe to even attempt to build the name and if it is not it won't try to do anything and it will fail right away. If that check passes then FltMgr proceeds to build the name but the request might still fail because one of the operations FltMgr performs fails (not enough memory, the path does not exist and so on). This is particularly useful in debugging where a minifilter developer might want to make sure they always exercise the full code path. The performance impact of never going to the cache is pretty significant so it's unlikely to be useful in other cases or in a production environment.

FLT_FILE_NAME_QUERY_CACHE_ONLY


This tells FltMgr to never query the file system and instead always go to the cache. If the name is not in the cache then FltMgr will not return anything. Possibly useful when the caller would like to get the name if it's cached but doesn't want to incur the cost of building it if it's not there. I can't think of a good use case because the name cache is a very dynamic thing and it gets purged when certain things happen and so it is impossible to rely on something being in the cache at any given time.

FLT_FILE_NAME_QUERY_ALWAYS_ALLOW_CACHE_LOOKUP


This is the option that is the most similar to any regular cache. It will look the name up in the cache and if it finds it, it returns it. If it doesn't find it, it will check to make sure it's safe to get the name from the file system (and fail if it's not). If it is safe then it will proceed to build the name.

FLT_FILE_NAME_QUERY_DEFAULT


The difference between this query method and FLT_FILE_NAME_QUERY_ALWAYS_ALLOW_CACHE_LOOKUP is quite subtle and I see a lot of questions about it. In this case FltMgr will first perform the checks to see whether it's safe to build the name from the file system before even looking at the cache. The reason it does this is that caches in general have a side effect of hiding problems in the data retrieval path (for name caches the data retrieval path is the name generation path). If the data retrieval path can fail in some cases then the cache might hide those failures by not exercising that path when data is cached. So the FltMgr designers introduced this flag to help the developer know when they are asking for the file name in a path where they shouldn't because it's not safe to even try to get a name. So in terms of implementing debugging code, FLT_FILE_NAME_QUERY_FILESYSTEM_ONLY is the flag that has the biggest likelihood of failure (it fails if it's not safe to query the name and if it encounters any errors in the name generation path). Then comes FLT_FILE_NAME_QUERY_DEFAULT, which will fail if it's not safe to query the name, but if it is safe and the name is in the cache it will return that (and so it won't fail in cases where errors would be encountered if trying to get the name from the file system). Finally, FLT_FILE_NAME_QUERY_ALWAYS_ALLOW_CACHE_LOOKUP is the least likely to fail, since it will always return the name from the cache if it is there, and only if the name isn't in the cache might it fail (if the check to see if it is safe to generate the name fails or if any other errors are encountered while trying to build the name). Personally I tend to always use FLT_FILE_NAME_QUERY_DEFAULT even in release builds, but I've seen people change it to FLT_FILE_NAME_QUERY_ALWAYS_ALLOW_CACHE_LOOKUP for release builds so as to somewhat reduce the likelihood of failure.
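To make the ordering of the checks easier to compare, here is a minimal user-mode C sketch of the four query methods as described above. It is only a toy model of the decision order, not FltMgr's implementation: the enum names, the query_state structure and model_query() are all made up for illustration, and RESULT_UNSAFE merely stands in for the STATUS_FLT_INVALID_NAME_REQUEST failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are NOT the FltMgr definitions. */
typedef enum {
    QUERY_DEFAULT,
    QUERY_CACHE_ONLY,
    QUERY_FILESYSTEM_ONLY,
    QUERY_ALWAYS_ALLOW_CACHE_LOOKUP
} query_method;

typedef enum {
    RESULT_FROM_CACHE,   /* name served from the name cache            */
    RESULT_BUILT,        /* name generated by querying the file system */
    RESULT_NOT_CACHED,   /* cache-only query and the name wasn't there */
    RESULT_UNSAFE,       /* the "is it safe to query" check failed     */
    RESULT_BUILD_FAILED  /* an error occurred on the generation path   */
} query_result;

/* The three facts the description above depends on. */
typedef struct {
    bool safe_to_query;   /* would the safety checks pass here?        */
    bool name_in_cache;   /* is the name currently cached?             */
    bool build_succeeds;  /* would name generation succeed if tried?   */
} query_state;

static query_result build_name(const query_state *s)
{
    return s->build_succeeds ? RESULT_BUILT : RESULT_BUILD_FAILED;
}

query_result model_query(query_method m, const query_state *s)
{
    assert(s != NULL);
    switch (m) {
    case QUERY_FILESYSTEM_ONLY:
        /* Never looks at the cache: safety check, then build. */
        if (!s->safe_to_query) return RESULT_UNSAFE;
        return build_name(s);
    case QUERY_CACHE_ONLY:
        /* Never goes to the file system. */
        return s->name_in_cache ? RESULT_FROM_CACHE : RESULT_NOT_CACHED;
    case QUERY_ALWAYS_ALLOW_CACHE_LOOKUP:
        /* Cache first; safety check only on a miss. */
        if (s->name_in_cache) return RESULT_FROM_CACHE;
        if (!s->safe_to_query) return RESULT_UNSAFE;
        return build_name(s);
    case QUERY_DEFAULT:
    default:
        /* Safety check first, even if the name is cached. */
        if (!s->safe_to_query) return RESULT_UNSAFE;
        if (s->name_in_cache) return RESULT_FROM_CACHE;
        return build_name(s);
    }
}
```

The interesting case is a cached name in a context where querying is not safe: in this model QUERY_DEFAULT reports RESULT_UNSAFE while QUERY_ALWAYS_ALLOW_CACHE_LOOKUP returns the cached name, which is exactly the difference the paragraph above describes.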

FLT_FILE_NAME_REQUEST_FROM_CURRENT_PROVIDER


When developing proper name providers that implement different namespaces above and below their level, in some code paths the minifilter might need a name for "above" their level and in some cases the name for "below" their level. For example, if the minifilter wants to return a name to the user (for example when populating directory entries or when completing a call for FileNameInformation) then the minifilter must get the name for "above" their layer. On the other hand, when the minifilter wants to open a file on the file system or when it wants to get a directory entry for a certain file they need the name for below their layer. Figuring out when to get the "above" name and when to use the "below" name is pretty complicated and it really depends on the architecture of the minifilter, but even building the appropriate name can be very complicated. There is one place where a name provider must always return the above name, and that is in the name provider callbacks (since those are called specifically to build names for the layers above). So by specifying this flag a minifilter can simply call FltGetFileNameInformation(…,FLT_FILE_NAME_REQUEST_FROM_CURRENT_PROVIDER, …) when they need the "above" name and call FltGetFileNameInformation() without this flag to get the "below" name. This flag is the only way that I know of where a minifilter can tell FltMgr "send this request to filters below me AND to me".
Please note that when this flag is set all operations related to generating and normalizing a name will be sent TO this layer instead of the layers below and so when a name provider uses this flag to request the "above" name, they will see IRP_MJ_CREATE, directory queries and other operations. In particular, a name provider that calls FltGetFileNameInformation(…,FLT_FILE_NAME_NORMALIZED|FLT_FILE_NAME_REQUEST_FROM_CURRENT_PROVIDER, …) in preCreate can easily get in trouble if IRP_MJ_CREATE is called on the normalization path (which is pretty common) because they'll see that create and call FltGetFileNameInformation again which will issue another create which will go to the minifilter again and so on until the stack runs out…
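The recursion described above can be illustrated with a small user-mode C model. Everything in it is hypothetical: model_pre_create() stands in for a preCreate callback, model_get_name_from_current_provider() stands in for a normalized name query with FLT_FILE_NAME_REQUEST_FROM_CURRENT_PROVIDER whose normalization path issues a create that the same filter sees, and MODEL_STACK_LIMIT is an arbitrary stand-in for running out of stack. The "already querying" flag is just one possible way to break the cycle and is an assumption of this sketch, not something FltMgr provides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_STACK_LIMIT 64  /* arbitrary stand-in for exhausting the stack */

typedef struct {
    bool use_guard;      /* honor the hypothetical "already querying" flag? */
    bool in_name_query;  /* set while this thread is inside a name query    */
    int  depth;          /* current nesting of model_pre_create() calls     */
    int  max_depth;      /* deepest nesting observed                        */
    bool overflowed;     /* true if the modeled stack limit was reached     */
} model_thread;

static void model_pre_create(model_thread *t);

/* Models a name query sent to the current provider: the normalization
 * path issues a create, which the same filter's preCreate then sees. */
static void model_get_name_from_current_provider(model_thread *t)
{
    bool previous = t->in_name_query;
    t->in_name_query = true;
    model_pre_create(t);            /* nested create on the normalization path */
    t->in_name_query = previous;
}

static void model_pre_create(model_thread *t)
{
    assert(t != NULL);
    if (t->depth >= MODEL_STACK_LIMIT) {
        t->overflowed = true;       /* the real system would be out of stack */
        return;
    }
    t->depth++;
    if (t->depth > t->max_depth) t->max_depth = t->depth;

    /* With the guard on, a create seen while a name query is already in
     * progress on this thread does not ask for the name again. */
    if (!(t->use_guard && t->in_name_query)) {
        model_get_name_from_current_provider(t);
    }
    t->depth--;
}

/* Runs one top-level create through the model and returns the final state. */
model_thread model_run(bool use_guard)
{
    model_thread t = { use_guard, false, 0, 0, false };
    model_pre_create(&t);
    return t;
}
```

In this model model_run(false) keeps nesting until it hits MODEL_STACK_LIMIT, while model_run(true) stops after the single nested create. How a real minifilter would recognize its own nested creates depends entirely on its design.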

FLT_FILE_NAME_DO_NOT_CACHE


In the case of a file system, any IRP_MJ_CREATE request that reaches the layer must be resolved to a file or failed. However, for a filter things are more complicated. Virtualization filters for example might only virtualize a part of the namespace and so only some of the IRP_MJ_CREATE requests that they see belong to them. They must decide based solely on the information that is available to them at IRP_MJ_CREATE time whether they should step in and virtualize the file or not, which means the file name must be used for that. But as we already know, getting the file name can be pretty complicated and it would be nice if the minifilter could call FltGetFileNameInformation(). Calling FltGetFileNameInformation() might send requests all the way to the file system with a name that might be wrong at the layers below the minifilter (if the minifilter is indeed supposed to virtualize the file). The problem isn't so much that the wrong name might reach the wrong layers (since the operations involved in the name generation path are non-destructive) but rather that the wrong name might be cached at those layers. So in this case the name provider minifilter can call FltGetFileNameInformation() and specify FLT_FILE_NAME_DO_NOT_CACHE to make sure that the name (right or wrong) doesn't get cached as a result of that call.

FLT_FILE_NAME_ALLOW_QUERY_ON_REPARSE


This flag is needed because in postCreate a call to FltGetFileNameInformation() fails if the request wasn't successful (if the file wasn't opened). However, minifilters that rely on reparse points to know when to perform some action might be written in a way that they do nothing in preCreate, and in postCreate, if they get STATUS_REPARSE and the reparse point belongs to them, then they know they must act. The problem is how to get the name in that case. They could send the IRP_MJ_CREATE down again and specify FILE_OPEN_REPARSE_POINT and query the name from the file system directly, but this is pretty high overhead and they would need to normalize the name themselves. Alternatively, they could always call FltGetFileNameInformation in preCreate and in postCreate only use it when they need to, but this adds a lot of overhead to each IRP_MJ_CREATE. This is where this flag can be used. When the file system completes an operation with STATUS_REPARSE and returns a reparse tag, the FILE_OBJECT->FileName is not modified and so the owner of that reparse tag can call FltGetFileNameInformation(…,FLT_FILE_NAME_ALLOW_QUERY_ON_REPARSE,...) and FltMgr will generate the name based on FILE_OBJECT->FileName. As the documentation clearly states, it is the caller's responsibility to ensure that the FileObject->FileName field was not changed. So this cannot be used in the general case, where a minifilter gets STATUS_REPARSE but doesn't own the reparse tag.
Also, I'm not sure why this is listed as a name provider only flag; it seems to me it could be used by filters that are not name providers. I can't tell for sure whether it would work, though, so some experimentation will be necessary.
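The postCreate decision described above can be sketched as a small user-mode C model. All the names are hypothetical (model_status, MODEL_MY_REPARSE_TAG, model_post_create_query() and so on are not WDK definitions), and file_name_untouched simply stands in for the caller's own knowledge that FILE_OBJECT->FileName was not changed; the model only captures when asking for the name with FLT_FILE_NAME_ALLOW_QUERY_ON_REPARSE would be appropriate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the outcome of an IRP_MJ_CREATE. */
typedef enum { CREATE_SUCCESS, CREATE_REPARSE, CREATE_FAILED } model_status;

/* Which kind of name query (if any) the filter should issue in postCreate. */
typedef enum {
    NAME_QUERY_NONE,             /* don't ask for the name at all           */
    NAME_QUERY_PLAIN,            /* ordinary query, the file was opened     */
    NAME_QUERY_ALLOW_ON_REPARSE  /* query with the allow-on-reparse flag    */
} model_name_query;

#define MODEL_MY_REPARSE_TAG 0x12345u  /* made-up tag owned by this filter */

typedef struct {
    model_status status;
    uint32_t     reparse_tag;          /* meaningful only for CREATE_REPARSE  */
    bool         file_name_untouched;  /* caller knows FileName wasn't changed */
} model_create_result;

model_name_query model_post_create_query(const model_create_result *r)
{
    assert(r != NULL);
    switch (r->status) {
    case CREATE_SUCCESS:
        /* The file was opened, so a regular name query can work. */
        return NAME_QUERY_PLAIN;
    case CREATE_REPARSE:
        /* Only the owner of the tag, who can vouch for FileName,
         * should use the allow-on-reparse query. */
        if (r->reparse_tag == MODEL_MY_REPARSE_TAG && r->file_name_untouched)
            return NAME_QUERY_ALLOW_ON_REPARSE;
        return NAME_QUERY_NONE;
    case CREATE_FAILED:
    default:
        /* The request failed; a postCreate name query would fail too. */
        return NAME_QUERY_NONE;
    }
}
```

A reparse with somebody else's tag falls through to NAME_QUERY_NONE in this model, which mirrors the point that the flag is not meant for the general STATUS_REPARSE case.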