Excel via Linked Servers


Recently, I had replied to post on the #sqlhelp Twitter hashtag regarding configuring a Linked Server to an Excel file using the GUI in Management Studio. This is very much possible. I use a linked server to pull data from .xls file on a 64-bit SQL Server instance for an application that I maintain.

Using the 64-bit ACE provider, you can now do this. The data source which is masked in the above picture is the location of the Excel file with the full file path.

Once you have this configured, you can access the Linked Server catalogs by expanding the Linked Server in Object Explorer. Each table listed in the catalog is actually an Excel sheet.

This is fairly simple task but since this isn’t an explicit example out there for this, I thought I would do a quick post on the same.

Addedum: April 4th, 2010. After my colleague, Evan pointed out the server side support policy for ACE.

Disclaimer: The ACE redistributable link does mention the following:

The Access Database Engine 2010 Redistributable is not intended:

  • As a replacement for the Jet OLEDB Provider in server-side applications.
  • To be used within a service program or web application that relies on a Windows service.
Technorati Tags: ,,

SQL performance gotchas for Distributed Queries

I am not sure how many of us pay attention to BOL notes under certain topics. I must admit that I have overlooked quite a few useful notes once in a while. But while working on a Linked Server permissions issue, I came across this point under the “Security for Linked Servers”:

To create the best query plans when you are using a table on a linked server, the query processor must have data distribution statistics from the linked server. Users that have limited permissions on any columns of the table might not have sufficient permissions to obtain all the useful statistics, and might receive a less efficient query plan and experience poor performance. If the linked server is an instance of SQL Server, to obtain all available statistics, the user must own the table or be a member of the sysadmin fixed server role, the db_owner fixed database role, or the db_ddladmin fixed database role on the linked server.

The reason for this is that the query optimizer is dependant on the table/index statistics to decide on the most optimal plan for the query. For remote queries, that perform a large number of joins or have complex sub-queries or complex filter conditions, you are bound to run into permission issues if the linked server remote login doesn’t have the necessary permissions to extract the statistics information. If you are not willing to give the linked server login the above rights on the remote table, then it would be better to either use an alternative means to fetch the remote data or maintain a local copy of the data using mirroring/log shipping/replication and operate on the local data to make sure that your business logic doesn’t get affected as the data increases on your server.

IMHO linked server queries should never be used for implementing complex business logic! Remote queries should be used to fetch as minimal data as possible. But Utopia is not always a reality!