Tag Archives: #tsql2sday

Wait for file before processing

As part of a recent project, I needed to check whether a file existed before starting an SSIS package to import that file. The catch was that I did not know what time the file would be placed on the file system. If the SSIS package runs before the file exists, the package fails.

You can create a Script Task component within SSIS to check for the file and then sleep, but several sources said that this approach could spike the processor, so I decided to go a different route. To solve the problem, I wrote a quick little PowerShell script that checks whether the file exists and waits in a loop before the SSIS package starts.

I created a SQL Agent job with two steps. The job runs daily at 1:00 AM and keeps a status of Executing on Step 1 until the file exists.

Step 1: PowerShell script to check for the file (see below)

Step 2: Execute SSIS package task.

The file that I am looking for in this example is: C:\test\ImportFolder\fileToCheck_20130212.txt
You will notice that today’s date is also appended to the end of the file name. The script builds the file name dynamically with the current date stamp.

# Build today's file name dynamically, with the current date stamp appended
$dt = Get-Date -Format yyyyMMdd
$path = 'C:\test\ImportFolder\'
$theFile = $path + 'fileToCheck_' + $dt + '.txt'

#  $theFile variable will contain:  C:\test\ImportFolder\fileToCheck_20130212.txt
# Loop until the file exists, checking once per minute
while (-not (Test-Path $theFile)) {
	# sleep for 60 seconds, then check again
	Start-Sleep -Seconds 60
}
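For reference, here is a minimal sketch of how the two-step job could be scripted with the msdb stored procedures. I built mine through the SSMS GUI, so treat the job name, step commands, and package path below as hypothetical placeholders:

USE msdb;
GO
EXEC dbo.sp_add_job @job_name = N'Daily File Import';  -- hypothetical job name

-- Step 1: the PowerShell wait script above
EXEC dbo.sp_add_jobstep
    @job_name          = N'Daily File Import',
    @step_name         = N'Wait for file',
    @subsystem         = N'PowerShell',
    @command           = N'<paste the script above here>',
    @on_success_action = 3;  -- 3 = go to the next step

-- Step 2: run the SSIS import package
EXEC dbo.sp_add_jobstep
    @job_name  = N'Daily File Import',
    @step_name = N'Execute SSIS package',
    @subsystem = N'SSIS',
    @command   = N'/FILE "C:\test\ImportPackage.dtsx"';  -- hypothetical package path

You would still need to attach the 1:00 AM schedule (sp_add_jobschedule) and a target server (sp_add_jobserver), both of which the GUI handles for you.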

This post is part of the blogging phenomenon known as T-SQL Tuesday. This month’s blog party is hosted by Wayne Sheffield, who is writing a series of blog posts throughout February, all about PowerShell.

T-SQL Tuesday #028 – Jack of All Trades, Master of None

This month’s T-SQL Tuesday is hosted by Argenis Fernandez, and the topic is: “blog about your experience. Tell us why you specialized, or why you’d like to specialize. If you don’t think that specialization is a good thing, tell us why. Discuss. Argue your point(s).”

=================================================

My first job out of college was as a Network Administrator for a small non-profit organization. As the only IT guy in the organization, I was responsible for a lot…managing servers, planning upgrades, scheduling downtime, email setup/support, network switches, cabling, wireless network access/security, managing and backing up data, help desk and desktop PC support, web site design, managing the phone system, the building overhead paging, the video surveillance system, report design and generation, unjamming printers and copiers, evaluating new applications, and developing custom applications. The organization had classrooms spread out across the entire county, and each classroom had various technologies, from PCs to phones, answering machines, and speaker systems.

I definitely could not specialize in any particular area in this position. On any given day I might travel 60 miles round trip to fix a problem in one of the outlying classrooms, then return to my office to prepare data for a federally mandated report. When people asked me what I did, I would tell them my job was to fix anything that plugged into a wall. With a non-existent budget, I had to get creative to make things work in the organization.

Through a series of life choices, I moved on to a new company and a new position focused more on systems and database administration. This position is definitely a lot more specialized than where I started. There are now other support groups that I can refer people to for issues with PCs, Exchange, networking, phones, etc.

I still think that it is important to keep up on a lot of the basics, especially with SQL Server. People like to blame the database as the problem, when often the solution is not nearly that simple. Many problems that I encounter on a day-to-day basis are rooted in specialties not directly related to SQL Server. I may not be a specialist in networking, PC repair, or Active Directory administration, but a good working knowledge of these concepts has been very beneficial to me. There are other people responsible for fixing these things at my current company, and with my generalist background I can usually figure out what the problem is and direct it to the correct support group fairly quickly.

T-SQL Tuesday #25: Sharing Tricks

The topic for this month’s T-SQL Tuesday, hosted by Allen White, is an invitation to share your tricks. Before I get into my trick, I wanted to mention that Allen is a pretty awesome speaker too. I got to see his session “Gather SQL Server Performance Data with PowerShell” at the SQL Saturday in Columbus earlier this year. Allen is really excited about SQL Server and the cool things you can do with PowerShell.

Now for my trick: I have a view that was created to help write more dynamic, rolling SQL queries and reports in my organization. I’m not sure who the original author of the script is, as it has been passed around and modified several times, but I thought I would share it here in the hope that it helps someone else someday. If you’re the original author or know who it is, please let me know so I can give due credit.

This view defines a set of date boundaries relative to the current date, including:

  • TODAY_BEGIN
  • TODAY_END
  • YESTERDAY_BEGIN
  • YESTERDAY_END
  • DAY_BEFORE_YESTERDAY_BEGIN
  • DAY_BEFORE_YESTERDAY_END
  • SUNDAY_WEEK_BEGIN
  • SUNDAY_WEEK_END
  • MONDAY_WEEK_BEGIN
  • MONDAY_WEEK_END
  • PREVIOUS_SUNDAY_WEEK_BEGIN
  • PREVIOUS_SUNDAY_WEEK_END
  • PREVIOUS_MONDAY_WEEK_BEGIN
  • PREVIOUS_MONDAY_WEEK_END
  • MONTH_BEGIN
  • MONTH_END
  • YESTERDAYS_MONTH_BEGIN
  • YESTERDAYS_MONTH_END
  • PREVIOUS_MONTH_BEGIN
  • PREVIOUS_MONTH_END
  • SECOND_PREVIOUS_MONTH_BEGIN
  • SECOND_PREVIOUS_MONTH_END
  • THIRD_PREVIOUS_MONTH_BEGIN
  • THIRD_PREVIOUS_MONTH_END
  • FOURTH_PREVIOUS_MONTH_BEGIN
  • FOURTH_PREVIOUS_MONTH_END
  • TWELFTH_PREVIOUS_MONTH_BEGIN
  • TWELFTH_PREVIOUS_MONTH_END
  • PREVIOUS_SIXTH_MONDAY_WEEK_BEGIN
  • PREVIOUS_SIXTH_MONDAY_WEEK_END
  • PREVIOUS_SIXTH_SUNDAY_WEEK_BEGIN
  • PREVIOUS_SIXTH_SUNDAY_WEEK_END
  • NEXT_MONTH_BEGIN
  • NEXT_MONTH_END

You can take a quick look at the results this view returns. After creating the view (script included below), run:

SELECT  *
FROM    vw_date_ranges

With these columns defined, you can easily query a database table for rows based on a date by cross joining this view and adding the date columns to the WHERE clause. For example, if you want to see all orders for “This Week”, you can run a query similar to:

SELECT SalesOrderID,
       OrderDate,
       SalesOrderNumber,
       PurchaseOrderNumber,
       CustomerID,
       TotalDue
FROM   Sales.SalesOrderHeader
       CROSS JOIN vw_date_ranges
WHERE  Sales.SalesOrderHeader.OrderDate >= vw_date_ranges.SUNDAY_WEEK_BEGIN

Or if you want to see all orders for “Last Month”, you can run a query similar to:

SELECT SalesOrderID,
       OrderDate,
       SalesOrderNumber,
       PurchaseOrderNumber,
       CustomerID,
       TotalDue
FROM   Sales.SalesOrderHeader
       CROSS JOIN vw_date_ranges
WHERE  Sales.SalesOrderHeader.OrderDate >= vw_date_ranges.PREVIOUS_MONTH_BEGIN
       AND Sales.SalesOrderHeader.OrderDate < vw_date_ranges.MONTH_BEGIN
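The *_END columns close a range from the other side. As a sketch against the same table, “Last Month” can also be written with BETWEEN. One caveat: the END values stop just short of midnight, so the open-ended form above (OrderDate < MONTH_BEGIN) is the safer pattern when rows can carry times in the last second of the period:

SELECT SalesOrderID,
       OrderDate,
       TotalDue
FROM   Sales.SalesOrderHeader
       CROSS JOIN vw_date_ranges
WHERE  Sales.SalesOrderHeader.OrderDate BETWEEN vw_date_ranges.PREVIOUS_MONTH_BEGIN
                                            AND vw_date_ranges.PREVIOUS_MONTH_END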

Using this method keeps my SQL queries clean and provides a rolling date range for queries and reports.

Download the script here


SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE VIEW [dbo].[vw_date_ranges]
(TODAY_BEGIN, TODAY_END, YESTERDAY_BEGIN, YESTERDAY_END, DAY_BEFORE_YESTERDAY_BEGIN,
DAY_BEFORE_YESTERDAY_END, SUNDAY_WEEK_BEGIN, SUNDAY_WEEK_END, MONDAY_WEEK_BEGIN, MONDAY_WEEK_END,
PREVIOUS_SUNDAY_WEEK_BEGIN, PREVIOUS_SUNDAY_WEEK_END, PREVIOUS_MONDAY_WEEK_BEGIN, PREVIOUS_MONDAY_WEEK_END, MONTH_BEGIN,
MONTH_END, YESTERDAYS_MONTH_BEGIN, YESTERDAYS_MONTH_END, PREVIOUS_MONTH_BEGIN, PREVIOUS_MONTH_END,
SECOND_PREVIOUS_MONTH_BEGIN, SECOND_PREVIOUS_MONTH_END, THIRD_PREVIOUS_MONTH_BEGIN, THIRD_PREVIOUS_MONTH_END, FOURTH_PREVIOUS_MONTH_BEGIN,
FOURTH_PREVIOUS_MONTH_END, TWELFTH_PREVIOUS_MONTH_BEGIN, TWELFTH_PREVIOUS_MONTH_END, PREVIOUS_SIXTH_MONDAY_WEEK_BEGIN, PREVIOUS_SIXTH_MONDAY_WEEK_END,
PREVIOUS_SIXTH_SUNDAY_WEEK_BEGIN, PREVIOUS_SIXTH_SUNDAY_WEEK_END, NEXT_MONTH_BEGIN, NEXT_MONTH_END)
AS
SELECT
-- Relative Dates - Days (the _END values land just before the start of the following period)
DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)                                        AS TODAY_BEGIN,
DATEADD(ms, -3, DATEADD(dd, DATEDIFF(dd, 0, GETDATE() + 1), 0)) - .000011574      AS TODAY_END,
DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), -1)                                       AS YESTERDAY_BEGIN,
DATEADD(ms, -3, DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), 0)) - .000011574          AS YESTERDAY_END,
DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), -2)                                       AS DAY_BEFORE_YESTERDAY_BEGIN,
DATEADD(ms, -3, DATEADD(dd, DATEDIFF(dd, 0, GETDATE()), -1)) - .000011574         AS DAY_BEFORE_YESTERDAY_END,
-- Relative Dates - Weeks
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), -1)                                       AS SUNDAY_WEEK_BEGIN,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), -1) + 6.999988426                         AS SUNDAY_WEEK_END,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), 0)                                        AS MONDAY_WEEK_BEGIN,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), 0) + 7.999988426                          AS MONDAY_WEEK_END,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), -8)                                       AS PREVIOUS_SUNDAY_WEEK_BEGIN,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), -8) + 6.999988426                         AS PREVIOUS_SUNDAY_WEEK_END,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), -7)                                       AS PREVIOUS_MONDAY_WEEK_BEGIN,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()), -7) + 7.999988426                         AS PREVIOUS_MONDAY_WEEK_END,
-- Relative Dates - Months
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()), 0)                                        AS MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + 1, 0)) - .000011574      AS MONTH_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE() - 1), 0)                                    AS YESTERDAYS_MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE() - 1) + 1, 0)) - .000011574  AS YESTERDAYS_MONTH_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 1, 0)                                    AS PREVIOUS_MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()), 0)) - .000011574          AS PREVIOUS_MONTH_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 2, 0)                                    AS SECOND_PREVIOUS_MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 1, 0)) - .000011574      AS SECOND_PREVIOUS_MONTH_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 3, 0)                                    AS THIRD_PREVIOUS_MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 2, 0)) - .000011574      AS THIRD_PREVIOUS_MONTH_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 4, 0)                                    AS FOURTH_PREVIOUS_MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 3, 0)) - .000011574      AS FOURTH_PREVIOUS_MONTH_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 12, 0)                                   AS TWELFTH_PREVIOUS_MONTH_BEGIN,
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) - 11, 0)) - .000011574     AS TWELFTH_PREVIOUS_MONTH_END,
-- Added 2/23/09: used in a claim audit report that looks at data from 6 weeks prior
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()) - 6, 0)                                    AS PREVIOUS_SIXTH_MONDAY_WEEK_BEGIN,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()) - 6, 0) + 7.999988426                      AS PREVIOUS_SIXTH_MONDAY_WEEK_END,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()) - 6, -1)                                   AS PREVIOUS_SIXTH_SUNDAY_WEEK_BEGIN,
DATEADD(wk, DATEDIFF(wk, 0, GETDATE()) - 6, -1) + 6.999988426                     AS PREVIOUS_SIXTH_SUNDAY_WEEK_END,
DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + 1, 0)                                    AS NEXT_MONTH_BEGIN,
-- The original script added .000011574 here; subtracting keeps NEXT_MONTH_END consistent with every other _END column
DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + 2, 0)) - .000011574      AS NEXT_MONTH_END
GO

T-SQL Tuesday #23: Fixing Joined Views

This month’s T-SQL Tuesday topic, hosted by Stuart Ainsworth, is Joins. I am going to share a story of a performance improvement that I made with joined views.

One of the main vendor-built applications that I support has views built on top of its tables. Unfortunately, that’s not the end of the story: these views are built on top of other views, which join together still other views, which finally link back to the database tables. In all the training materials provided by the vendor, they say to *always* use the views when writing a report on the data, and never to query the tables directly.

Using the vendor-provided views generally works out OK and performs reasonably well. One particular report I wrote kept bugging me because of how long it took to run (several minutes each time) compared to how little data it actually returned. So I started looking at the execution times and the query plans. Note: I took the screenshots below using SQL Sentry Plan Explorer, a fantastic free tool from SQL Sentry.

This report needed to join 5 tables to get the data I needed. Using the vendor-provided views, here is the join diagram I started out with:

You can see that the vendor views join together a lot more hidden tables (table and field names blurred to protect the innocent) than the 5 I actually need. The nested views even hit some of the same tables more than once.

Here is the original query plan:

You can’t see it in the screenshot, but one of the thick lines in the middle represents 23 million rows!

OK, time to pull out the detective hat. I decided to rewrite the query using only the base tables. I had to do a bit of extra work on things like the UTC-to-local datetime conversion that the views had been doing for me (see the sketch below).
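One common way to reproduce that kind of conversion against base tables is to apply the server’s current UTC offset. This is only a sketch, not necessarily the vendor’s exact logic: the table and column names are hypothetical, and using the current offset ignores historical daylight saving transitions.

SELECT DATEADD(MINUTE,
               DATEDIFF(MINUTE, GETUTCDATE(), GETDATE()),  -- server's current UTC offset
               t.OrderTimeUTC) AS OrderTimeLocal           -- hypothetical UTC column
FROM   dbo.BaseOrderTable AS t                             -- hypothetical base table

After the rewrite, the join diagram looked like this: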

Here is the final query plan:

The highest number of rows coming through is about 11,000, a far cry from the 23 million rows in the original query! The execution plan also looks a lot leaner than what I started with.

The original query was running with an average CPU time = 26062 ms, and an average elapsed time = 26424 ms.
My rewritten query is now running with an average CPU time = 0 ms, and an average elapsed time = 266 ms.
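If you want to gather numbers like these yourself, SET STATISTICS TIME prints CPU and elapsed milliseconds for each statement on the Messages tab in SSMS. A minimal sketch, with a stand-in query:

SET STATISTICS TIME ON;

SELECT COUNT(*) FROM Sales.SalesOrderHeader;  -- stand-in for the actual report query

SET STATISTICS TIME OFF;

Run each version of the query several times and average the results, since individual runs vary with caching.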

Looking at the actual execution plan, SSMS prompts me that there is a missing index for my new query, and there is still a Clustered Index Scan that accounts for 87% of the query. I may look into that more at a later date, but for now I am very happy running a report in a fraction of a second that used to take several minutes. I don’t think it’s really worth the extra effort to try to shave another 100 ms off of a query that completes in under 300 ms.
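As an aside, SQL Server also records missing-index suggestions in DMVs, so you can revisit them later. A quick sketch; treat these as hints to investigate, not indexes to create blindly:

SELECT d.statement AS table_name,
       d.equality_columns,
       d.inequality_columns,
       d.included_columns,
       s.avg_user_impact
FROM   sys.dm_db_missing_index_details AS d
       JOIN sys.dm_db_missing_index_groups AS g
            ON g.index_handle = d.index_handle
       JOIN sys.dm_db_missing_index_group_stats AS s
            ON s.group_handle = g.index_group_handle
ORDER BY s.avg_user_impact DESC;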

The vendor-supplied nested views were each joining together multiple views and tables, which added a lot of extra, unnecessary bloat to my query. An hour of work cutting out all that bloat made a huge difference in this particular report’s run time.

And the users rejoice at how fast their report now runs!

T-SQL Tuesday #22: Data Presentation

This month’s T-SQL Tuesday topic: Data Presentation.

I am not going to show any code this month, and I am also going to veer slightly off the topic of SQL Server.

My company does a lot of reporting through Cognos. The nice thing about Cognos is that it is platform independent. Once a connection is made to an Oracle, MSSQL, or even MySQL database, the presentation looks the same to a report developer. The end user goes to the Cognos URL and runs their report; they may not even know what application or technical back end is supplying the data.

We have LOTS of reports that get run across the enterprise. We also have probably a couple hundred report developers who write reports with varying frequency. I’m sure you can imagine that without any standards, no two reports would look alike, and end users would not know how to use or interpret the information they get out of a report. That is why data presentation is so important.

There are several standards that have to be met in our environment before a report can be published for end users. While I don’t think it would be appropriate for me to show you any screenshots, I can summarize some of the general rules here.

  1. All reports must use the corporate header and footer.  These must include the business unit for the report, the report title, the corporate logo, a confidentiality statement, page numbers, and a note on how the report is best viewed (e.g., HTML, PDF, Excel).
  2. The last page of the report must be the standard documentation page.  This page has a table with specific information about the report, including:
    • Report Purpose: A general description of what the report is and how it should be used.
    • Data Source: The name of the server, database, or application where the data in the report came from.
    • Author: Who wrote the report, and how to contact them with questions.
    • Security: The security requirements of the report.  Does it contain confidential information?  Which users or security groups are allowed to run it?
    • Business Logic: Information that may be critical to interpreting the report.  For example, “Annual sales figures in this report are calculated based on the fiscal year of July 1 to June 30”.  Also list any abbreviations that not everyone may know.  For example, “AHG = Average Height by Gender (I made that up)”.
    • Assumptions: What assumptions, if any, were made while writing the report.  I have also seen some report authors insert the user-defined variables here.  That way, when someone runs a report and thinks it is wrong, the report author can see exactly what values the user entered into the prompt boxes as a starting point for troubleshooting.

The documentation page becomes a valuable tool for anyone trying to understand a report, and it is also very useful for someone troubleshooting one. Of course, it is most helpful when the end user does not delete the documentation page…

My First T-SQL Tuesday

This is my first T-SQL Tuesday post, and actually my first run at blogging. And on a special new day too! I have been reading others’ blogs for about a year now and thought that perhaps I could contribute something that helps someone else out one day.

This month’s T-SQL Tuesday challenge: reveal your crap to the world. Why is (or was) it crap? Why did you do it? And how did you learn from your mistake?

Yup, I’ll admit it…I’ve written lots of crap code! I didn’t sit down one day and decide to write some crap code just for the fun of it. I wrote crap code because I didn’t know a better way to solve the problem at the time.

There have been plenty of times that I have gone back to review code I wrote in the past and said, “Matt, why on earth did you do that?!?!” I don’t have any specific examples to share, though, because I try to fix bad code as quickly as I can and burn the old version.

Some things that I have learned the hard way over the years:
1. Don’t believe everything you read on the internet.

  • Make sure you are using a reputable source. I’ve seen some crazy solutions posted online to solve simple problems.
  • As I’ve read multiple times: trust, but verify. Even if you find code to solve a problem from a reputable source, there is always a chance that they made a mistake or a syntax error, were making an April Fools’ joke, or were flat out wrong about something.

2. Comment, Comment, Comment, Comment.

  • And then make a comment about your comments.
  • When you write bad code, including comments that explain what you were trying to accomplish will make your life a lot easier when you try to remember what your goal was 6 months or 2 years later.

3. If you get stuck, walk away.

  • No, not forever…I mean take a break. I have had plenty of times that I was working on a script or a query that seemed very complicated. The length of the code kept growing, and the entire thing kept getting more complex. What did I do? I saved the code, stopped thinking about it, went home, and got a good night’s sleep. The next morning, when I pulled the code up, I could usually see that it was too complex and bang out a much simpler solution that performed much better.

4. Find a mechanism to test your code. In SQL Server, there are many ways to test how well your code performs.

  • SET STATISTICS PROFILE ON (BOL) and SET STATISTICS TIME ON (BOL) to measure how long the code takes to run
  • Check the box in SSMS to include the actual execution plan, combined with the awesome tool SQL Sentry Plan Explorer
  • And absolutely don’t forget to test that the output matches what you started with (or fixes an error in the output). I have been embarrassed more than once when I *fixed* a report, only to have someone point out that data was now missing from the new version.

5. Keep on learning.

  • While I know enough to function and get by day to day, I know that there is still a heck of a lot that I don’t know.
  • When I can find free time, I read others’ blogs. I almost always learn something new in the process about SQL Server, server performance, security, or tips and tricks for writing better code.
  • Here are some resources that I have found invaluable in my career as a DBA. (So as not to offend anyone, my disclaimer is that there are plenty more resources I have found invaluable; this is just a short list.)
  1. Pinal Dave (blog) – As I was getting started, almost every question that I googled had Pinal on the first page of results.
  2. Brent Ozar PLF (blog)
  3. Thomas Larock (blog)
  4. Blog aggregators: SQLServerCentral.com (rss) and SQLServerPedia (rss)
  5. Microsoft SQL Server Books Online: http://msdn.microsoft.com/en-us/library/bb545450.aspx