Should the FIM DPE Have a SQL to XPath Translator?

Nov 12, 2010 at 10:14 PM

Today the FIM DPE does have limited ability to translate some SQL to XPath. 

I'm pretty sure we should turf this function due to what I think will be high complexity and low value.  Especially given the insight from Craig's blog post:

In Planning Changes to Software, Take the Effects on the Processes for Maintaining the Software into Account


Our server accepts queries in the form of XPath expressions.  Transact-SQL queries are generated from the XPath expressions, and the Transact-SQL queries are executed to retrieve the data matching the query from our data store. 


An XPath parser produces an object model representation of the inbound query, and a SQL generator yields the executable Transact-SQL version.  In the process of figuring out the optimal schema for our data store and the best way of structuring the complicated Transact-SQL queries, we interposed an XML representation of the query between the XPath parser and the SQL generator.  So, at that point, the XPath parser produced the object model representation of the query, which was then represented as XML and passed to the SQL generator.  That arrangement was useful because a library of XML documents representing various queries could be created then used as input for testing various versions of the SQL generator in isolation.  Several iterations of the SQL generator were being produced each week at that point, as we evaluated different schema and various query strategies. 


Once we settled on a schema and a way of formulating our SQL queries that yielded satisfactory performance for all of our diverse query test cases, we considered whether to remove the step of producing an XML representation of a query before generating the Transact-SQL version  Yielding the Transact-SQL directly from the object model representation would be faster, because it would eliminate a step. 


Due the complexity of our queries, the time to execute them generally dwarfs the time to translate it from XPath into Transact-SQL, regardless of whether we express the query as XML in between.  On the other hand, having the XML to isolate the SQL generator from the rest of the server had become crucial to our process for maintaining the SQL generator.  Whenever a defect was found in our query feature, it was invariably due to the Transact-SQL not returning the correct results, or not executing quickly enough, or being syntactically erroneous.  We’d produce the XML representation of the query that caused the error once, and execute the SQL generator by itself, using that XML as input, and examine the Transact-SQL that got produced in SQL Server Management Studio.  Once the error in the Transact-SQL was identified and the SQL generator modified to correct it, the generator would be tested in isolation against our library of XML representations of queries, which was now expanded to include the one for the most recent failure case. 


Prior to having the XML version of a query as input to the SQL generator, when the SQL generator took to the object model representation of the query as input, debugging the SQL generation was decidedly more difficult.  It required installing and executing the service and attaching a debugger to that, then sending in the query from a client application, and using the debugger attached to the service to examine the representation of the query in memory and step through the SQL generation code.  Adding to the difficulty and duration of this process was the fact that our standard client does not allow a user to send an XPath query to our server directly.  Instead, it provides a graphical user interface for building a query that the client translates into XPath for transmission to the server.  So, because we didn’t think to maintain a client dedicated to sending arbitrary XPath expressions at our service for the sake of debugging queries, whenever we had to debug a query, one of us had to build a client for transmitting the XPath query to the service by scavenging the code for that purpose from the code for the standard client. 


Thus, the answer to the question about whether we should improve performance by removing the step of producing an XML representation of a query in the process of translating it from XPath into SQL was a firm “no.”   If the question was considered purely in terms of the software, then the answer might well have been “yes,” because the software would have been simplified, perhaps, by the omission of a step, and its performance improved, albeit just a little.  It is when the issue is considered in terms of the process by which the software is maintained that the correct answer is readily apparent.  In this case, it is hard to justify the step of representing the query as XML in terms of the design of the solution, and especially difficult to justify it with respect to the functional requirements.  Yet, it is crucial for the human process of servicing the query generator.