The case of the corrupt images

I was recently working on an environment where there was a need to find find out corrupted image streams in a database in the form of BLOBs. You might say why not run a CHECKDB on the entire table and find out if there are any inconsistencies. Well if it were that simple, I probably wouldn’t have started penning this post! 🙂 Let me give you a bit of history here to set things in perspective…. The binary stream was inconsistent when the data was inserted into the database. So this was a clear case of “Garbage In Garbage Out”.

imageWhat made the problem take a convoluted turn was the use of this BLOB data in SQL Server Reporting Services reports. If you have reports which convert BLOB data into images before rendering them into reports, then the rendering would fail if the binary stream is inconsistent. When you have a few images in the table, it’s quite possible to manually identify the inconsistent BLOB data. Now when the number of images stored in the database moves to tens of millions, I am sure the DBA will find it very amusing to spend a years looking through the images manually! So what really shows up as the image when you have a corrupted binary stream in a SSRS report is just a Red “x”! (see left).

Since the manual option didn’t seem very palatable, I thought it might be time for some automation. And that is where C# came to my rescue. The logic was very simple. Pick an image of the database, store it on the filesystem in JPG/PNG/BMP format and then verify it’s validity. There are three ways that the image being rendered can have an issue:

1. There is no image in the BLOB column in the database i.e. the column value is NULL. This is quite easy to circumvent by using the IS NOT NULL filter or the ISNULL function.
2. The image column value is not null but it stores a 0KB image which will again create a problem during the report rendering phase. This still ain’t that bad a problem as we can search for images with binary data values as 0x00 or 0x0 with relative ease!
3. The third type of image is the main problem where there might be a few bit flips or part of the binary stream is inconsistent. This makes it exceptionally difficult to locate beforehand which images will fail to render while generating the report.

0x89504E470D0A1A0A0000000….
NULL
0x0

0x89504E47309AFD068163D86……

If you look at the binary streams above, the last three fall under categories described above. All the image streams in read will create a problem. The first and last streams look valid but the fourth stream is invalid even though if it has a valid header.

Using the code below, you could identify the three different failure scenarios by scanning all the images that are stored as BLOBs in your database. Though if you have the need to undertake this task, then it is advisable that you perform such a validity check during off business hours.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Drawing;
using System.Data;
using System.Data.SqlClient;

namespace ImageCheck
{

class Program
{
static void Main(string[] args)
{
// While loop to iterate through the database table
int ctr = 1;
while (ctr <= 24)
{
ReValidateImage(ctr);
ctr++;
}

// Wait for user input before exiting
Console.ReadKey();
}


public static void ReValidateImage(int SNo)
{

bool vImgCheck;
// SQL query to fetch the image
string sql = "SELECT SNo,[Document] from dbo.myTable WHERE SNo = " + SNo.ToString();

SqlConnection con = new SqlConnection("Server=.;Integrated Security=True;database=ReportDB");
SqlDataAdapter da = new SqlDataAdapter(sql, con);
SqlCommandBuilder MyCB = new SqlCommandBuilder(da);
DataSet ds = new DataSet("MyImages");
byte[] MyData = new byte[0];

// Fetch the image into a dataset
da.Fill(ds, "MyImages");
DataRow myRow;

myRow = ds.Tables["MyImages"].Rows[0];

// Store the image on the filesystem
try
{
if (myRow["document"] != null)
{
MyData = (byte[])myRow["document"];
int ArraySize = new int();
ArraySize = MyData.GetUpperBound(0);
// Handle ZERO kb image size
if (ArraySize > 0)
{
FileStream fs = new FileStream(@"C:\1.jpg", FileMode.OpenOrCreate, FileAccess.Write);
fs.Write(MyData, 0, ArraySize);
fs.Close();
// Check image saved on filesystem for validity
using (FileStream fsRead = new FileStream(@"C:\1.jpg", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
using (StreamReader sr = new StreamReader(fsRead))
{
vImgCheck = IsValidImage(fsRead);
if (vImgCheck == true)
Console.WriteLine("Image is OK for iteration " + SNo.ToString());
else
Console.WriteLine("Image is corrupt for iteration " + SNo.ToString());
}
}
}
else
{
// Report ZERO kb images
Console.WriteLine("Image size is ZERO for iteration " + SNo.ToString());
}
}
}
catch (SystemException ex)
{
// Handle NULLs extracted from the database
if (ex.GetType().ToString() == "System.InvalidCastException")
Console.WriteLine("Cannot export image for iteration " + SNo.ToString());
}
}
// Function to check validity of an image
public static bool IsValidImage(Stream stream)
{
try
{
try
{
// Try and check if the seek is possible. If it fails, then we have problems!
stream.Position = 0;
stream.Seek(0, SeekOrigin.Begin);
}
catch { }
using (Image img = Image.FromStream(stream))
{
// IF there are no errors, then report success
img.Dispose();
return true;
}
}
catch
{
return false;
}


}
}

A sample output of the above code when executed against a table which has two columns SNo (int) and Document (image) with SNo being the primary key would look like this:

Image is OK for iteration 1
Image size is ZERO for iteration 2
Cannot export image for iteration 3
Image is corrupt for iteration 4
Image is corrupt for iteration 5
Image is corrupt for iteration 6

So what can you do with the above code1? Firstly, you would need to modify the functions to align with the schema of your table. The second TO-DO item would be to write another function for storing the inconsistent binary stream row details in a text file or into a database table. You could even add an user interface if you want to go all out fancy!

Disclaimer: This Sample Code is provided for the purpose of illustration only and is not intended to be used in a production environment. THIS SAMPLE CODE AND ANY RELATED INFORMATION ARE PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR PURPOSE.

SQL Saturday #116: A first but definitely not the last

Last weekend we successfully concluded SQL Saturday #116 which was the first ever SQL Saturday organized in India. I had the opportunity to present at the event and it was definitely an experience that I enjoyed immensely! Right at the outset, I want to call out two aspects of the event:

  • The audience – The audience participation and enthusiasm was amazing without whom this event wouldn’t have been a success!
  • Our sponsors, volunteers and SQL PASS – It’s needless to say that the event without the support of our sponsors and volunteers wouldn’t have been possible!

The event had two tracks with even a SQL Server Clinic which was like an Ask-the-Expert session where you could bring the problems that you experience with SQL Server and discuss it with the SQL experts from CSS. This was a new addition which was the brainwave of my colleague, Tejas Shah [Twitter]. If you have attended the SQL Pass Summit, then you will be familiar with this concept.

I was presenting on Demystifying SQL Server Memory Dumps which I thoroughly enjoyed presenting! I have already uploaded my presentation to the SQL Saturday website. All the debugging demos shown are available under the debugging category tag on my blog.

Thank you everyone who attended in making this a fun event for us!!

Here are some other blog posts on the event from other speakers/attendee: Lohith [Speaker], Sudeepta [Attendee/Volunteer], Manas [Volunteer/Attendee], Varun Dhawan [Attendee] and Vinod Kumar [Speaker].

DSC00023

Technorati Tags: ,,

SQL Saturday #116: Are you gonna be there?

I am really looking forward to SQL Saturday 116 which is the first SQL Saturday being organized in Bangalore at the Microsoft Office on January 7th, 2012. There are some interesting stuff lined up for the day with two different tracks: Main and Practical. You can look up the schedule here.

Not only do you have an amazingly line-up of speakers but there’s also going to be a SQL Server Clinic. Those of you who have attended the SQL Pass Summit might be familiar with this concept where you have MSFT CSS Engineers having deep technical expertise in SQL Server available to you to discuss your real-world SQL problems that you face in your environments that you manage or work on. We love the challenge of new problems that we have not seen before and derive immense amount of satisfaction in helping someone resolve a tough or difficult issue involving SQL Server! If you have a specific problem to solve, it helps to bring the details. ERRORLOG files, error messages, specific query syntax, or details of your environment. The more background context you provide us with, the more we can narrow down with our suggestions. No presentations, just some serious troubleshooting and on-the-spot assistance from 2:30PM IST – 5PM IST under the Practical track.

I am glad that I will have the opportunity of presenting at SQL Saturday #116 along with the likes of Pinal Dave [Blog | Twitter], Vinod Kumar [Blog | Twitter] and Balmukund Lakhani [Blog | Twitter]. My presentation topic reads “Demsytifying SQL Server Memory Dumps” at 11:30AM IST. As the topic reads, I am going to clarify why digging into the root causes which result in SQL Server memory dumps might not be a daunting task!

I am totally looking forward to Jan 7th!! That would be a cracker of way to start off the first weekend on the New Year! Are you gonna be there?

Sysdatabases is empty–You say Whaaat?

While trying to backup a SQL Server database using SQL Writer (snapshot backups using VSS framework), then you might notice the following error in the application event logs.

Log Name: Application
Source: SQLWRITER
Event ID: 24581
Task Category: None
Level: Error
Description:
Sqllib error: System table sys.sysdatabases in SQL Server instance <SQL Server Name> is empty.

The above error is a bit misleading because it doesn’t literally mean that all your databases on the SQL Server instance reported have disappeared. If you scour the application event logs, you will find that another error is reported along with this error during the same timeframe.

Log Name: Application
Source: SQLWRITER
Event ID: 24583
Level: Error
Description:
Sqllib error: OLEDB Error encountered calling IDBInitialize::Initialize. hr = 0x80040e4d. SQLSTATE: 28000, Native Error: 18456
Error state: 1, Severity: 14
Source: Microsoft SQL Server Native Client 10.0
Error message: Login failed for user ‘NT AUTHORITY\SYSTEM’.
DBPROP_INIT_DATASOURCE: <SQL Server Name>
DBPROP_INIT_CATALOG: master
DBPROP_AUTH_INTEGRATED: SSPI

Now it is clear that the SYSTEM account is unable to log into the SQL Server instance mentioned in the error message above. If you look into the SQL Errorlog, you will find the following error message:

Error: 18456, Severity: 14, State: 11.
Login failed for user ‘NT AUTHORITY\SYSTEM’. Reason: Token-based server access validation failed with an infrastructure error. Check for previous errors. [CLIENT: <local machine>]

State 11 is basically telling you that the account doesn’t have access to the SQL Server instance. The reason for the login failure is available on SQL Server 2008 instances and above. There is a catch here the Login Failed error message is only reported in the SQL Errorlogs and Application Eventlogs only if the System account is granted access to the SQL Server instance but not granted the right permissions to get the database list from the sys.databases catalog. The hex code 0x80040e4d corresponds to an authentication failure.

So why does this happen? When you initiate a backup of a SQL Server database through SQLWriter (VSS Framework), SQLWriter will try to connect to all online instances to build a list of files associated with each database. SQLWriter/VSS needs this information to create the exclusion list. The account that the SQLWriter service uses is the NT AUTHORITY\SYSTEM account. So if you have multiple instances of SQL Server online on the same server, then the SYSTEM account needs to have SYSADMIN permissions OR the necessary permissions to run a query against the sys.databases catalog on the instance. This is a by-design requirement and documented in the KB Article mentioned below. However, if the SQL Server instance is not started, then you SQLWriter is not bothered about the instance(s) as I/O need not be frozen for database files for an instance which is shutdown.

919023    SQL Server 2005 connectivity and Volume Shadow Copy Service (VSS)
http://support.microsoft.com/default.aspx?scid=kb;EN-US;919023

Excerpt from above article:

Additionally, because of the types of operations that the writer must perform, we recommend that you do not remove the NT AUTHORITY\SYSTEM login from the sysadmin server role.

Now if you do not want to grant sysadmin permission to the System account, then you need to do the following to prevent the above error:

1. Grant db_datareader role to the SYSTEM account on the master database. This is required for other queries that the SQLWriter may need to executed during the course of the backup.
2. Grant ALTER ANY DATABASE or VIEW ANY DATABASE server-level permission, or CREATE DATABASE permission in the master database to the SYSTEM account. These are the minimum permissions required to query sys.databases table.

Normally the SYSTEM account is part of Built-in Administrators group on the box if it is not removed due to security hardening. You would need to grant the above permissions to the System account on all SQL Server instances that are installed and online on the server where you are trying to perform the VSS backup.

Another symptom of the problem manifests itself when you execute the command: vssadmin list writers. You will find that the above error message(s) are reported and the SQLWriter is not listed in the list of available writers.

One you have granted the SYSTEM account the necessary permissions, a quick way to verify that everything is working as expected is to run the vssadmin list writers command again. You might get the following output:

Writer name: ‘SqlServerWriter’
Writer Id: <GUID>
Writer Instance Id: <GUID>
State: [1] Stable
Last error: Non-retryable error

If you find that a non-retryable error is reported, then you need to restart the SQL Server VSS Writer service. Once this is done, run the command again and if there are no problems encountered, then you will get the following output for the SQLWriter.

Writer name: ‘SqlServerWriter’
Writer Id: <GUID>
Writer Instance Id: <GUID>
State: [1] Stable
Last error: No error

NOTE: If a SQL Server VSS (Snapshot) Backup is in progress for any instance on the server, then a restart of the SQL Server VSS Writer service should be deferred till the backup is completed or fails.

Easing in the trace FILTER for SQLDIAG

Now that the Diag Manager is available online and you can use configured custom PSSDIAG/SQLDIAG configuration files to collect data from SQL Server instances, you might want to configure your data collection packages in such a manner that profiler traces are setup correctly with trace filters if needed.

Note: Trace filtering can dramatically reduce trace (.TRC file size) and the I/O cost of tracing, but you should be aware that it can actually increase the CPU burden of tracing. To minimize the extra CPU use, filtering should be performed on an integer column (dbid, duration, etc.) instead of a text column (database name, textdata, etc) whenever possible. If a filter doesn’t remove a significant portion of the trace events (example >10%), it probably isn’t worth it, and might actually introduce more overhead than it prevents. While configuring PSSDIAG/SQLDIAG for SQL Server, you cannot add Profiler Trace Filters. Even if you do so from the GUI, it would not be included in the configuration file so that the filters are honored when  the server side trace starts up.

imageTo set filters for profiler traces collected with PSSDIAG/SQLDIAG, you need to:

1. Initialize PSSDIAG/SQLDIAG on the server. By this I mean start the PSSDIAG/SQLDIAG.
2. Find out the Trace ID of the profiler trace running using fn_trace_getinfo function or sys.traces view.
3. Use the Trace ID obtained from the above step, and use the sp_trace_setfilter stored procedure to set the filter. Refer "SQL Profiler Data Columns" under SQL Server Books Online for the Data Column numbers and "sp_trace_setfilter" topic for finding out the values of the logical and comparison operators.
4. To verify that the filter is active, use the fn_trace_filterinfo function. 

I shall demonstrate in this blog post how this works. As you can see in the see in the screenshot that the TraceID 2 is what I want to customize. Now that I have trace id, I will stop the profiler trace using sp_trace_setstatus stored procedure. Status value 0 will stop the trace.

Once I stop the trace, I use the sp_trace_setfilter function with the appropriate Data Column ID for setting a filter on SPID 52 to set a filter on SPID 52. Then I start the trace again. When I look into the profiler trace, I find that before the trace was stopped it was collecting data for all SPIDs but after setting the filter data is being collected ONLY for SPID 52 (see highlighted sections screenshot below).

imageOnce the filter is set. You can use the fn_trace_getfilterinfo function to verify that the trace filter is active.

If you are not using SQLDIAG to collect profiler traces, then it is as simple as configuring a server side trace. The only additional thing that you need to in this case is add the column filters while configuring the profiler trace. The above exercise is required only when you are collecting traces using SQLDIAG/PSSDIAG.

Reference:
How to: Filter Events in a Trace (SQL Server Profiler)
http://msdn.microsoft.com/en-us/library/ms175520.aspx

/*T-SQL commands used*/

select * from sys.traces — To get the trace id
EXEC sp_trace_setstatus 2,0 — Stops the trace but doesn’t delete the trace definition from the server
EXEC sp_trace_setfilter 2, 12, 0, 0, 52 — Add a filter for SPID = 52
EXEC sp_trace_setstatus 2,1 — Start the trace again
select * from fn_trace_getfilterinfo(2) — Get information about the filters set for the trace

image

Hope the above information helps in setting up trace filter for a profiler trace collected using SQLDIAG.