SQL performance gotchas for Distributed Queries

I am not sure how many of us pay attention to BOL notes under certain topics. I must admit that I have overlooked quite a few useful notes once in a while. But while working on a Linked Server permissions issue, I came across this point under the “Security for Linked Servers”:

To create the best query plans when you are using a table on a linked server, the query processor must have data distribution statistics from the linked server. Users that have limited permissions on any columns of the table might not have sufficient permissions to obtain all the useful statistics, and might receive a less efficient query plan and experience poor performance. If the linked server is an instance of SQL Server, to obtain all available statistics, the user must own the table or be a member of the sysadmin fixed server role, the db_owner fixed database role, or the db_ddladmin fixed database role on the linked server.

The reason for this is that the query optimizer is dependant on the table/index statistics to decide on the most optimal plan for the query. For remote queries, that perform a large number of joins or have complex sub-queries or complex filter conditions, you are bound to run into permission issues if the linked server remote login doesn’t have the necessary permissions to extract the statistics information. If you are not willing to give the linked server login the above rights on the remote table, then it would be better to either use an alternative means to fetch the remote data or maintain a local copy of the data using mirroring/log shipping/replication and operate on the local data to make sure that your business logic doesn’t get affected as the data increases on your server.

IMHO linked server queries should never be used for implementing complex business logic! Remote queries should be used to fetch as minimal data as possible. But Utopia is not always a reality!

Advertisement