When you walk into a new project, it can be a challenge to build a mental image of what’s going on quickly. I could spend an hour just talking about how to get organized quickly. Instead of blathering on, here’s a script to query the system tables of the database to document it’s structure and then sample the actual data to help you see what’s actually in there.
The script will output the following columns: TableName, ColumnName, recordCnt, distinctCnt, nonNullCnt, NullCnt, fldSample. The general idea is to show you a list of the tables and columns followed by some counts of how many distinct values each table has along with how many records are null (or not). The real value is in seeing a sample of the values in each column (it’s in the fldSample column) . The script is set to limit the sample data to 200 characters, but that’s easy to change.
In any event, here’s the script:
USE [Datawarehouse]
GO
declare @tbl varchar(50)
declare @col varchar(50)
declare @sqlStmt varchar(700)
declare @val varchar(100)
declare @fldSample varchar(200)
set @val = ”
set @fldSample = ”
–Recreate the profile table
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[dbProfile]’) AND type in (N’U’))
DROP TABLE [dbo].[dbProfile]
CREATE TABLE [dbo].[dbProfile](
[TableName] [varchar](50) NOT NULL,
[ColumnName] [varchar](50) NOT NULL,
[recordCnt] [int] NULL,
[distinctCnt] [int] NULL,
[nonNullCnt] [int] NULL,
[NullCnt] [int] NULL,
[fldSample] varchar(300)
) ON [PRIMARY]
declare dbSchema cursor for
SELECT t.name AS table_name,
c.name AS column_name
FROM sys.tables AS t
INNER JOIN sys.columns c ON t.OBJECT_ID = c.OBJECT_ID
ORDER BY table_name
open dbSchema
fetch next from dbSchema into @tbl, @col
while @@FETCH_STATUS = 0
begin
set @sqlStmt = ‘declare fldList cursor for select distinct ‘ + @col + ‘ from ‘ + @tbl
EXEC (@sqlStmt)
set @fldSample = ”
open fldList
fetch next from fldList into @val
while @@FETCH_STATUS = 0
begin
set @fldSample = @fldSample + rtrim(ltrim(@val)) + ‘, ‘
fetch next from fldList into @val
end
close fldList
deallocate fldList
set @sqlStmt = ‘insert into dbProfile (TableName, ColumnName, recordCnt, distinctCnt, nonNullCnt, nullCnt, fldSample) ‘
set @sqlStmt = @sqlStmt +’select ”’+@tbl+”’ as TableName ,”’+@col+”’ as ColumnName, ‘
set @sqlStmt = @sqlStmt +’count(*) as recordCnt, ‘
set @sqlStmt = @sqlStmt +’count(distinct [‘+@col+’]) as distinctCnt, ‘
set @sqlStmt = @sqlStmt +’sum(case len(rtrim([‘+@col+’])) when 0 then 0 else 1 end) as nonNullCnt, ‘
set @sqlStmt = @sqlStmt +’sum(case len(rtrim([‘+@col+’])) when 0 then 1 else 0 end) as NullCnt, ‘
set @sqlStmt = @sqlStmt + ”” + left(@fldSample,200) + ””
set @sqlStmt = @sqlStmt +’ from ‘ + @tbl
–select @sqlStmt
begin try
EXEC (@sqlStmt)
end try
begin catch
print(@sqlStmt)
end catch
fetch next from dbSchema into @tbl, @col
end
close dbSchema
deallocate dbSchema
select * from dbProfile
Leave a Reply