SQLblog.com – The SQL Server blog spot on the web

Andy Leonard

Andy Leonard is an author and engineer who enjoys building and automating data integration solutions.

  • Quick Regions in SQL Server Management Studio (SSMS)

    There are a few tools and add-ins that support the creation of “regions” in T-SQL code. SQL Server Management Studio (SSMS) natively supports one way to separate sections of long-ish T-SQL scripts: begin/end blocks.

    To demonstrate, start with a snippet of T-SQL that includes begin/end:

    [Screenshot SSMSRegion0: a T-SQL snippet containing a begin/end block]

    If you click on the “-“ highlighted above, the code will collapse and appear as shown here:

    [Screenshot SSMSRegion1: the begin/end block collapsed]

    If you add a comment…

    [Screenshot SSMSRegion2: a comment added to the begin/end block]

    Collapsing displays your comment, creating a quick “region” in your T-SQL:

    [Screenshot SSMSRegion3: the collapsed block displaying the comment]
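
    If you'd like to try it, here's a minimal sketch of the idea (the table names are made up; the exact snippet in the screenshots above may differ). The comment sits just above the begin, so it stays visible when the block beneath it collapses:

    -- Stage customer data (this comment remains visible when the block collapses)
    begin
        truncate table stage.Customer;

        insert into stage.Customer (CustomerId, CustomerName)
        select CustomerId, CustomerName
        from   source.Customer;
    end;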

    :{>

  • Data Wrangling Concepts: Latency and Staleness

    Latency is the amount of time between data creation and load. Staleness is the amount of time since data in a target was refreshed.
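
    To make the two measures concrete, here's a minimal T-SQL sketch. It assumes a hypothetical dbo.LoadAudit table that records, for each target table, when the last load completed and the newest source-row creation time included in that load; the table and column names are illustrative only:

    Select
      la.TargetTableName
    , DateDiff(minute, la.MaxSourceCreatedAt, la.LoadCompletedAt) As LatencyMinutes   -- creation-to-load
    , DateDiff(minute, la.LoadCompletedAt, SysDateTime())         As StalenessMinutes -- load-to-now
    From dbo.LoadAudit As la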

    In many data integration use cases, a latency or staleness of one day, one week, or even one month is acceptable. Enterprise Data & Analytics helps enterprise data integration teams reduce latency and staleness. We often help customers seek near-real-time solutions – or near 0 latency and staleness.

    How do we approach a solution?

    1. Tuning
    2. Re-architect
    3. Redesign

    Tuning

    If the source schemata are stable, we begin tuning at the sources and destinations. Most targets and many sources are data stores. If we’re using SQL Server Integration Services (SSIS) for data integration, many find it odd that we begin at the sources and destinations before tuning the SSIS packages. We start with the databases because we often (not always) see more improvement from database tuning than from SSIS tuning.

    If the source schemata are not stable, we examine how the enterprise is managing this volatility.

    Re-architect

    Some enterprises manage volatile source schemata manually. Employing a Biml-Driven Architecture (BDA) automates some of that manual effort; often, a BDA can eliminate the manual work entirely.

    BDA can also facilitate loading a common target with dozens (or hundreds) of sources.

    Perhaps most intriguing, BDA can reduce the time-to-analysis, allowing data scientists to begin experimenting with the data sooner and, hopefully, expediting results.

    Redesign

    Support and maintenance are too often overlooked when considering latency and staleness. Consider: while your team is troubleshooting an issue with the enterprise data integration solution, the data in the target is growing staler and latency is increasing at a rate of one minute per minute. Employing design patterns is one way to reduce support and maintenance overhead. If most (or all) of your data integration solution employs the same (or similar) design pattern(s), understanding one package means team members understand many (or all) packages. Implementing a BDA once design patterns are known adds even more efficiency to enterprise data integration support and maintenance.

    Software design best practices cannot be overemphasized. Separation of concerns, decoupling, coding to contracts, testing, source control, and Data Integration Lifecycle Management (DILM) save more time and money than business owners realize.

    Conclusion

    At Enterprise Data & Analytics, we help enterprises build faster data integration solutions, build data integration solutions faster, and make data integration execution more manageable. If you are interested in learning more, please contact us.

    :{>

    Learn More:
    Biml in the Enterprise Data Integration Lifecycle (Password: BimlRocks)
    From Zero to Biml - 19-22 Jun 2017, London 
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – 2-6 Oct 2017, Chicago

    Tools:
    SSIS Framework Community Edition
    Biml Express Metadata Framework
    SSIS Catalog Compare
    DILM Suite

  • Big Data vs. Sampling

    BIGdata

    Merriam-Webster defines sampling as:

    1. the act, process, or technique of selecting a suitable sample; specifically: the act, process, or technique of selecting a representative part of a population for the purpose of determining parameters or characteristics of the whole population

    2. a small part selected as a sample for inspection or analysis: “ask a sampling of people which candidate they favor”

    In statistics, sampling is the practice of viewing or polling a representative subset of a population. Sampling generates statistical error, so results are often published as some value +/- some error range, e.g., 47% +/- 5%. It’s possible to produce accurate but useless statistical results in this manner. For example, if the result of a political poll in a race between two candidates is 47% +/- 5%, one interpretation is “it’s too close to call.” On second thought, that’s not useless. But it may be less useful for a candidate who desires to spend large sums of money to prepare for a victory celebration.
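
    As a quick back-of-the-envelope sketch (the numbers here are illustrative, not from any real poll), the “+/- 5%” comes from the standard margin-of-error approximation at 95% confidence, 1.96 * sqrt(p * (1 - p) / n):

    Declare @p float = 0.47; -- observed proportion
    Declare @n float = 385;  -- sample size (illustrative)

    Select 1.96 * Sqrt(@p * (1.0 - @p) / @n) As MarginOfError; -- ~0.05, i.e. about +/- 5 points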

    The Vs. Part…

    Sampling informs analysts and data scientists of an approximation and a range of potential error. Sampling says we don’t need to poll every individual to achieve a result that is “good enough.”

    “Big data” attempts to poll every individual.

    The Problem I am Trying To Solve

    Is more data better? In his 2012 book, Antifragile, Nassim Nicholas Taleb (fooledbyrandomness.com | @nntaleb) – the first data philosopher I encountered – states:

    “The fooled-by-data effect is accelerating. There is a nasty phenomenon called ‘Big Data’ in which researchers have brought cherry-picking to an industrial level. Modernity provides too many variables (but too little data per variable), and the spurious relationships grow much, much faster than real information, as noise is convex and information is concave.” – Nassim Nicholas Taleb, Antifragile, p. 416

    According to Taleb, there’s a bias for error embedded in big data; more is not better, it’s worse. I’ve experienced this with business intelligence solutions and spoken about data quality in data warehouse solutions, saying:

    “The ratio of good:bad data in a useless / inaccurate data warehouse is surprisingly high; almost always north of 95% and often higher than 99%.”

    The Solution

    The solution to bad data has always been (and remains) data quality. “Just eliminate the inaccurate data” sounds simple but it’s not an easy problem to solve. In data science, data quality is the next-to-longest long pole. (Data integration is the longest long pole.) The solution for the two longest poles in data science is the same: automation.

    At Enterprise Data & Analytics, we’re automating data quality. I mention it not by way of advertisement (although a geek’s gotta eat), but to inform you of another research focus area in our enterprise. Are we the only people trying to solve this problem? Goodness no. (That would be tragic!) We are focused on automating data science. You can learn more about our solutions for automating data wrangling at DILM Suite.

    :{>

    Learn More:
    Biml in the Enterprise Data Integration Lifecycle (Password: BimlRocks)
    From Zero to Biml - 19-22 Jun 2017, London 
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – 2-6 Oct 2017, Chicago

    Tools:
    SSIS Framework Community Edition
    Biml Express Metadata Framework
    SSIS Catalog Compare
    DILM Suite

  • The Recording for Biml in the Enterprise Data Integration Lifecycle is Available!

    Automate All The Things!

    The recording for the webinar Biml in the Enterprise Data Integration Lifecycle is now available (registration required). The password for the meeting is BimlRocks.

    You may be reading this post and thinking, “But Andy, I don’t recall seeing anything about this webinar until just now.” That’s entirely possible. I advertised this webinar in advance, but only to members of the DILM Mailing List. The DILM Mailing List is the best way to keep up with what I’m up to these days. I’ll post most of what I’m working on and thinking about here, but not all and rarely first. If you want to keep up with everything and/or learn of it first, the DILM Mailing List is going to be your friend.

    Enjoy the webinar!

    :{>

    Learn More:
    From Zero to Biml - 19-22 Jun 2017, London 
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – 2-6 Oct 2017, Chicago

  • Changes to PASS Summit Program Selection Process

    Wendy Pastrick [Blog | @wendy_dance], (my friend) who leads the PASS Program effort on the PASS Board, recently wrote Changes to the PASS Summit 2017 Program, Pre-Conference Call for Interest, and a Community Survey. The PASS Summit 2017 Program will be somewhat different from previous years. “Different in what ways, Andy?” Excellent question, and the answer is, “It depends.” (You saw that coming, didn’t you?)

    It depends (some, at least) on your feedback which you can provide by completing the PASS Summit 2017 Program Survey. The survey should take no more than five minutes to complete. You can even win a three-day pass to the Summit! I encourage everyone in the SQL Server Community to take five minutes to respond to the survey. I did.

    There are two town hall online meetings scheduled for 13 April. Registration is required and I’ve signed up for the earlier meeting.

    Changes to PASS Summit 2017 Speaker Selection Process

    Another chunk of Wendy’s blog post is dedicated to changes to the Summit speaker selection process:

    Experienced community speakers with a consistent track record for highly rated Summit sessions will be invited to work closely with the Program team to develop updated sessions for specific content areas. PASS is also looking to invest in a small number of new Summit speakers to further strengthen the educational value of the event.

    Selecting speakers with a consistent track record of highly-rated Summit presentations mostly makes sense to me. It fits with the desire to “invest in a small number of new Summit speakers.” As I mentioned in PASS Summit 2016 Speakers last year, my favorite statistic from Allen White’s [Blog | @SQLRunr] PASS Blog post Insights into the PASS Summit Speaker Selection Process was 20% of the speakers selected were first-time presenters. I interpret the quote from Wendy’s post to indicate the number of first-time speakers at the PASS Summit 2017 will be south of 20%. That’s probably ok. It’s probably more ok to have this metric bounce around from year-to-year so that new Summit speakers are introduced to PASS’s audience and gain experience speaking at a national event. From a practical perspective, it’s either that or clone Conor Cunningham, Jen Underwood, Bob Ward, Jen Stirrup, Brent Ozar, Lara Rubbelke, Paul Randal, Kimberly Tripp, Buck Woody, Jessica Moss, Grant Fritchey, Adam Machanic, Steve Jones, Denny Cherry, and other popular speakers.

    Changes to PASS Summit 2017 Precon Selection Process

    A bigger change appears afoot in the Summit precon selection process. I cannot locate a list of pre-qualifications (yet). Instead, interested parties are asked to “please fill out this call for interest form and email it to our Program team by April 7 at 7 AM UTC.”

    Highly-rated, experienced full-day speakers with expert knowledge of their key topic area will be considered.

    It’ll be interesting to see how this plays out.

    PASS Summit 2017 Call for Presentations

    Wendy indicates the PASS Summit 2017 Call for Presentations will open in May. The CFP last year closed in early March (something like 2 Mar, if memory serves), which was… inconvenient for the Summit and for presenters who wanted to present about the new stuff – SQL Server 2016 – which wasn’t released for General Availability until 1 Jun 2016. (I joked the PASS Summit 2017 CFP would likely open the week before the PASS Summit 2016… I was wrong.) Holding the CFP in May makes more sense for presenters and attendees.

    As For Me…

    <transparency>
    As I mentioned in PASS Summit 2016 Speakers I wasn’t selected to present at the PASS Summit 2016. Because I had a lot of other things on my plate I opted to not attend last year. I was able to watch the keynotes online and probably had a better view of them than if I’d been there in person. I missed hanging out with many friends that I typically see only once per year, but I was surprised that that was the only negative part of not attending. To be sure, missing reconnecting with friends is not trivial but I thoroughly enjoyed staying home last November – more than I thought I would.

    This year I’ve mulled simply not submitting. Truth be told, I’m still mulling not responding to either the Precon Interest Process (I downloaded the form, completed it, and… saved it while I think about it some more) or the Call for Presentations, although I’m more open now than before to submitting and letting the chips fall where they may. Regardless of where the chips fall, attending the PASS Summit – although it’s a good event with lots of opportunity to network, make new friends, and catch up with old friends – has lost some of its appeal for me in recent years.
    </transparency>

    Andy

    Take Action:
    1. Read Wendy’s post: Changes to the PASS Summit 2017 Program, Pre-Conference Call for Interest, and a Community Survey
    2. Invest 5 minutes in PASS: PASS Summit 2017 Program Survey
    3. Join one of the town hall meetings: Registration
    4. If you’re interested in delivering a precon: Please fill out this call for interest form and email it to the PASS Program team by April 7 at 7 AM UTC.
  • Please Blog

    My name is Andy and I’m a blogger.

    I encourage you to blog, too. Why? Because you know stuff. I need to learn some of the stuff you know. You’ve experienced stuff. I’d like to read your experiences – again, to learn from them. Others feel the same way.

    “I don’t have anything to say.”

    That’s simply not true. You have plenty to say. Maybe it’s hard to get started. I get that. So here’s a starter post for you: Write a blog post about starting a blog. If you want you can mention this post. It’s ok if you’d rather not. But write that post.

    “I don’t know where to go to get started.”

    Search Google or Bing or your favorite search engine for “how to start a blog” or just click one of those links (I’ve already saved the search). Why don’t I just tell you how to start a blog? You’re going to have to learn how to learn (if you don’t already know how to learn). To paraphrase a famous person, “blogger, teach thyself.”

    I only know how to start one blog (well, a couple, but still). I don’t know how to start your blog. You’ll have to figure that out on your own. I’ve given you one idea for one post (above).

    “All the good stuff has already been written.”

    Probably not. Think about this: If you were writing this very blog post instead of me, would you use the same grammar? Punctuation? Alliteration? Someone reading this post is experiencing the equivalent of nails on a chalkboard because of my writing style. (I know you’re there. Thank you for reading anyway. I love you. I’m sorry.)

    Even if all the good stuff has been written, it hasn’t been written in your style, in your voice.

    You ran into that issue just the other day, didn’t you? The one you had to search to find the solution? That quirky error with the error message that might as well have been written in ancient Sumerian for all the help it provided? Write about that. If another blog post helped you, link to that blog in your post. Most bloggers appreciate the link.

    “I’m too busy.”

    Really? How much television do you watch per week?

    Peace,
    Andy

  • The Recording for the SSIS Catalog Compare Version 2 Launch Event is Available

    The recording for the SSIS Catalog Compare Version 2 Launch Event is now available for viewing (registration required).

    Enjoy!

    :{>

    Learn More:

    SSIS Catalog Compare
    Join the DILM Suite Mailing List

  • Why Automate?

    Because, as Jen Underwood (jenunderwood.com | LinkedIn | @idigdata) states in an upcoming podcast: The future of data science is automation.

    If automation is the future, how do we decide what to automate? We look for the long pole. What’s the long pole in data science?

    [Image DataWranglingQuoteUnicorn: data wrangling quote from Unicorns Among Us]

    Data Integration is the Long Pole

    According to Lars Nielsen in his book Unicorns Among Us:

    “As intellectually stimulating as the work might be, there are also mundane aspects to the data scientist's job. In fact, there is pure drudge work that can take from 50 to 80 percent of a data scientist's time: This involves the collecting, cleaning and organizing of unruly, disparate, messy, unstructured data before it can be mined for the gold of actionable BI. Some call this aspect of the job ‘data wrangling.’" (emphasis mine)

    The long pole in data science is “data wrangling” or “data munging,” also known as data integration.

    “How Do We Automate Data Integration, Andy?”

    I’m glad you asked! My answer is:

    1. Eliminate redundancy.
    2. Automate execution.
    3. Practice DILM (Data Integration Lifecycle Management).

    First, eliminate the cut-and-paste drudge work from data integration development.

    What is the “cut-and-paste drudge work?” A significant amount of data integration is based on repeatable patterns. One can automate SSIS design patterns, such as truncate-and-load and incremental load, using Biml (Business Intelligence Markup Language). Hours, days, and weeks of work can be reduced to minutes by using Biml and Biml Frameworks, such as BimlFlex and the Biml Express Metadata Framework.

    Second, automate data integration execution.

    Data integration execution has long been grouped into processing jobs or batches. A best practice in SSIS development is to build small, unit-of-work packages that perform specific load or transformation operations. Why? Smaller packages are easier to develop, test, manage, and maintain. Unit-of-work packages promote code re-use, as some packages can be used in multiple batched operations.

    There’s no free lunch, and building unit-of-work packages presents a new issue: Now one has lots more SSIS packages to execute. An execution framework addresses this issue. Enterprise Data & Analytics offers a free execution framework, the SSIS Framework Community Edition.

    The SSIS Framework Community Edition includes a Parent SSIS package that executes packages collected into batches called “SSIS Applications.” Metadata about SSIS Applications is stored in tables integrated (although isolated by a custom schema) into the SSISDB database. Data integration professionals can configure metadata for three (or three hundred) packages in a single batch, and this batch can be executed by starting the Parent SSIS package and overriding a single parameter.
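
    To illustrate the idea (this is a simplified, hypothetical sketch and not the actual SSIS Framework Community Edition schema), framework-style metadata might look something like this:

    Create Schema fwk;
    go

    Create Table fwk.Applications
    (
      ApplicationId int Identity(1,1) Constraint PK_Applications Primary Key
    , ApplicationName nvarchar(128) Not Null
    );

    Create Table fwk.ApplicationPackages
    (
      ApplicationPackageId int Identity(1,1) Constraint PK_ApplicationPackages Primary Key
    , ApplicationId int Not Null Constraint FK_ApplicationPackages_Applications
        References fwk.Applications(ApplicationId)
    , FolderName nvarchar(128) Not Null
    , ProjectName nvarchar(128) Not Null
    , PackageName nvarchar(260) Not Null
    , ExecutionOrder int Not Null
    );

    A Parent package reads the rows for one ApplicationName (supplied as its single parameter) and executes each package, in ExecutionOrder, via the SSIS Catalog.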

    Consider this quote in a CNN article from a senior official with Obama 2012 US re-election campaign:

    [Image Obama2012DataIntegration: quote from a senior Obama 2012 re-election campaign official on data integration]

    Third, data integration needs enterprise lifecycle management, like all other software development. (Data integration development is software development, after all.)

    Repeatable, collaborative, and communicable processes form the heart of enterprise DevOps. Repeatable releases and source control for SSIS are no longer optional because they improve code quality and reduce downtime. Enterprises need at least three lifecycle “tiers” – Development, Test, and Production. Why three? Development and Production are usually not in question; it’s the third tier that raises questions. It’s not important what this tier is called – or even whether there are more tiers between Development and Production. This tier is important because it’s not Production and not Development.

    All software works in Development. Software is built in Development and the design-time defaults all point to Development resources. Enterprises do not want the first deployment of any software to be the Production deployment. Instead, a test deployment – to a different environment (not Development and not Production) – will assure all external parameters are properly configured and included in the deployment plan. A successful test deployment to an environment (lifecycle tier) that matches Production dramatically improves confidence that the Production deployment will succeed.
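
    For SSIS Catalog deployments, “repeatable” can be as simple as a scripted deployment along the lines of the sketch below (the folder, project, and .ispac path are hypothetical):

    Declare @ProjectBinary varbinary(max) =
      (Select BulkColumn
       From OpenRowset(Bulk N'E:\Deploy\LoadSource.ispac', Single_Blob) As ispac);

    Declare @operation_id bigint;

    Exec SSISDB.[catalog].deploy_project
      @folder_name    = N'DILM'
    , @project_name   = N'LoadSource'
    , @project_stream = @ProjectBinary
    , @operation_id   = @operation_id Output;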

    When deploying SSIS to the SSIS Catalog, though, you need to be sure your test deployment tier is closely aligned to the Production environment. That can be… challenging, but SSIS Catalog Compare detects – and can script and deploy – differences between SSIS Catalogs residing in different Data Integration Lifecycle tiers. Catalog Compare generates scripts for externalized parameters – parameters that override the design-time defaults – by scripting SSIS Catalog Environments, Environment Variables, Project and Package References, and Reference Mappings.
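
    For context, externalizing a parameter by hand uses the standard SSISDB catalog stored procedures. The folder, project, and variable names below are hypothetical; this is a sketch of the kind of configuration such scripts capture:

    -- Create an environment and a variable in it
    Exec SSISDB.[catalog].create_environment
      @folder_name = N'DILM'
    , @environment_name = N'Test';

    Exec SSISDB.[catalog].create_environment_variable
      @folder_name = N'DILM'
    , @environment_name = N'Test'
    , @variable_name = N'SourceConnectionString'
    , @data_type = N'String'
    , @sensitive = 0
    , @value = N'Data Source=TestSql;Initial Catalog=Source;Integrated Security=SSPI;'
    , @description = N'Source connection for the Test tier';

    -- Reference the environment from the project and map the parameter to the variable
    Declare @reference_id bigint;
    Exec SSISDB.[catalog].create_environment_reference
      @folder_name = N'DILM'
    , @project_name = N'LoadSource'
    , @environment_name = N'Test'
    , @reference_location = N'R' -- 'R' = relative (environment lives in the project's folder)
    , @reference_id = @reference_id Output;

    Exec SSISDB.[catalog].set_object_parameter_value
      @object_type = 20 -- 20 = project-level parameter
    , @folder_name = N'DILM'
    , @project_name = N'LoadSource'
    , @parameter_name = N'SourceConnectionString'
    , @parameter_value = N'SourceConnectionString' -- the environment variable name
    , @value_type = 'R'; -- 'R' = referenced (use the environment variable)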

    Conclusion

    Why is automation important? Automating data integration changes the dynamics for data science.

    Jen’s right. Lars is right. Automation is the future of data science and automating the long pole – data integration – is the place to begin. For SSIS automation, DILM Suite is a solution.

    What can we accomplish by automating data science? We can change the world.

    :{>

    Learn More:
    From Zero to Biml - 19-22 Jun 2017, London 
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – April 2017, Chicago 
    IESSIS2: Immersion Event on Advanced Integration Services – Oct 2017, Chicago

    Tools:
    SSIS Framework Community Edition
    Biml Express Metadata Framework
    SSIS Catalog Compare
    DILM Suite

    Recordings and Posts:
    SSIS Lifecycle Management
    Advanced SSIS Execution
    SSIS Academy: Using the SSIS Catalog Day 1 - Create the Catalog and Deploy
    SSIS Academy: Using the SSIS Catalog Day 2 - Package Execution and Monitoring

    The Data Integration Lifecycle Management (DILM) Blog Series:
    An Example of Data Integration Lifecycle Management with SSIS, Part 0
    An Example of Data Integration Lifecycle Management with SSIS, Part 1
    An Example of Data Integration Lifecycle Management with SSIS, Part 2
    An Example of Data Integration Lifecycle Management with SSIS, Part 3
    An Example of Data Integration Lifecycle Management with SSIS, Part 4

  • Parsing SSIS Catalog Messages for Lookup Performance, v2

    A couple years ago I wrote Administering SSIS: Parsing SSIS Catalog Messages for Lookups. I’ve updated the script to return all Lookup Transformation performance messages captured by the SSIS Catalog with Basic logging enabled (the SSIS Catalog’s default Logging Level). Some smart data integration people recommend setting the SSIS Catalog Logging Level to Verbose and querying catalog schema views to retrieve row counts and performance metrics. I’m not a fan of running a Verbose logging level by default due to the overhead. Granted, the overhead is relatively light but it’s not nothing. Sometimes I need all the cycles!
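
    If you want to check (or change) your Catalog's server-wide default, the logging level is exposed in the catalog properties; a quick sketch:

    -- 0 = None, 1 = Basic, 2 = Performance, 3 = Verbose
    Select property_name, property_value
    From SSISDB.[catalog].catalog_properties
    Where property_name = N'SERVER_LOGGING_LEVEL';

    -- Shown for completeness; this post argues for leaving the default at Basic
    Exec SSISDB.[catalog].configure_catalog
      @property_name = N'SERVER_LOGGING_LEVEL'
    , @property_value = 1; -- Basic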

    I prefer to passively gather performance metrics – even if it’s a little convoluted and more work (for me). I don’t often need these metrics immediately available, so I execute queries similar to the query below and store the performance metrics in a table. I can (and do) build dashboards to track SSIS performance (perhaps I should blog about them…) using these passively collected performance metrics.

    Without further ado, the updated query:

    Use SSISDB
    go

    declare @LookupIdString varchar(100) = '% rows in the cache. The processing time was %'
    declare @LookupNameStartSearchString varchar(100) = '%:Information: The %'
    declare @LookupNameStartLen int = Len(@LookupNameStartSearchString) - 2
    declare @LookupNameEndSearchString varchar(100) = '% processed %'
    declare @LookupNameEndLen int = Len(@LookupNameEndSearchString) - 2
    declare @ProcessingTimeString varchar(100) = 'The processing time was '
    declare @ProcessingTimeSearchString varchar(100) = '%' + @ProcessingTimeString + '%'
    declare @CacheUsedString varchar(100) = 'The cache used '
    declare @CacheUsedSearchString varchar(100) = '%' + @CacheUsedString + '%'

    Select
    SubString(om.[message]
            , (PatIndex(@LookupNameStartSearchString, om.[message]) + @LookupNameStartLen)
            , (PatIndex(@LookupNameEndSearchString, om.[message]) - (PatIndex(@LookupNameStartSearchString, om.[message]) + @LookupNameStartLen))
              ) As LookUpName
    , Convert(bigint, Substring(om.[message]
                              , (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)
                              , ((CharIndex(' ', om.[message], PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen))
                                  -
                                 (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)
                                )
                               )
             ) As LookupRowsCount
    , Convert(decimal(16, 3), Substring(om.[message]
                                      , (PatIndex(@ProcessingTimeSearchString, om.[message]) + Len(@ProcessingTimeString) + 1)
                                      , ((CharIndex(' ', om.[message], PatIndex(@ProcessingTimeSearchString, om.[message]) + Len(@ProcessingTimeString) + 1))
                                         -
                                         (PatIndex(@ProcessingTimeSearchString, om.[message])+ Len(@ProcessingTimeString) + 1)
                                        )
                                       )
             ) As LookupProcessingSeconds
    , Convert(bigint, Substring(om.[message]
                             , (Patindex(@CacheUsedSearchString, om.[message]) + Len(@CacheUsedString) + 1)
                             , ((Charindex(' ', om.[message], Patindex(@CacheUsedSearchString, om.[message]) + Len(@CacheUsedString) + 1))
                                 -
                                (Patindex(@CacheUsedSearchString, om.[message]) + Len(@CacheUsedString) + 1))
                               )
             ) As LookupBytesUsed
    , Convert(decimal(16, 3), (Convert(bigint, Substring(om.[message]
                                                       , (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)
                                                       , ((CharIndex(' ', om.[message], PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen))
                                                          -
                                                          (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen))
                                                        )
                                      )
                                     /
                                      (Convert(decimal(16, 3), Substring(om.[message]
                                                                       , (PatIndex(@ProcessingTimeSearchString, om.[message]) + Len(@ProcessingTimeString) + 1)
                                                                       , ((CharIndex(' ', om.[message], PatIndex(@ProcessingTimeSearchString, om.[message]) + Len(@ProcessingTimeString) + 1))
                                                                          -
                                                                          (PatIndex(@ProcessingTimeSearchString, om.[message])+ Len(@ProcessingTimeString) + 1 ))
                                                                        )
                                               )
                                      )
                               )
            ) As LookupRowsThroughput
    , ex.start_time As ExecutionStartTime
    , ex.folder_name As FolderName
    , ex.project_name As ProjectName
    , ex.package_name As PackageName
    --Into rpt.LookupMetrics -- drop table rpt.LookupMetrics
    From [catalog].operation_messages om
    Join [catalog].executions ex
      On ex.execution_id = om.operation_id
    Where om.message_source_type = 60 -- Data Flow Task
      And om.[message] Like @LookupIdString
      -- data type-checking
      And IsNumeric(Substring(om.[message], (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen ), ((CharIndex(' ', om.[message], PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)) - (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)))) = 1 -- Check LookupRowsCount
      And IsNumeric(Substring(om.[message], (PatIndex(@ProcessingTimeSearchString, om.[message]) + Len(@ProcessingTimeString) + 1 ), ((CharIndex(' ', om.[message], PatIndex(@ProcessingTimeSearchString, om.[message]) + Len(@ProcessingTimeString) + 1)) - (PatIndex(@ProcessingTimeSearchString, om.[message])+ Len(@ProcessingTimeString) + 1 )))) = 1 -- Check LookupProcessingSeconds
      And IsNumeric(Substring(om.[message], (Patindex(@CacheUsedSearchString, om.[message]) + Len(@CacheUsedString) + 1 ), ((Charindex(' ', om.[message], Patindex(@CacheUsedSearchString, om.[message]) + Len(@CacheUsedString) + 1)) - (Patindex(@CacheUsedSearchString, om.[message]) + Len(@CacheUsedString) + 1)))) = 1 -- Check LookupBytesUsed
      -- catch 0-valued denominator rows
      And Convert(bigint, Substring(om.[message], (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen ), ((CharIndex(' ', om.[message], PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)) - (PatIndex(@LookupNameEndSearchString, om.[message]) + @LookupNameEndLen)))) <> 0 -- Check LookupRowsCount
    Order By operation_id DESC

    :{>

    Related Training:
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – April 2017, Chicago 
    From Zero to Biml - 19-22 Jun 2017, London
    IESSIS2: Immersion Event on Advanced Integration Services – Oct 2017, Chicago

    Learn More:
    SSIS Design Patterns and Biml: A Day of Intelligent Data Integration – Boston SQL Saturday precon, 24 Feb 2017
    Save Time and Improve SSIS Quality with Biml
    An Example of Data Integration Lifecycle Management with SSIS, Part 4
    The Recordings for SSIS Academy: Using the SSIS Catalog are Available
    SSIS Catalog Compare v2.0 Launch Event 7 Mar 2017!

  • Presenting Use Biml to Automate SSIS Design Patterns at SQL Saturday–Richmond!

  • Presenting Use Biml to Automate SSIS Design Patterns at SQL Saturday–Raleigh!

  • Creating SSIS Packages with the SQL Server Import and Export Wizard

    This material was originally posted on the Linchpin People blog.

    In this post, I demonstrate how to use the SQL Server 2012 Import and Export Wizard to create a SQL Server 2012 Integration Services (SSIS 2012) package. The demo is created on a virtual machine running Windows Server 2012.

    You can use the SQL Server Import and Export Wizard to learn more about SQL Server Integration Services. Often, one will reach the “Save and Run Package” page just before executing the Wizard. The “Run immediately” checkbox is checked and the “Save SSIS Package” checkbox is unchecked by default. If you check the “Save SSIS Package” checkbox, you can choose to store the SSIS package in the MSDB database (the “SQL Server” option) or on the file system. You may also select the Package Protection Level:

    [Screenshot ImportExportWizard1: the Save and Run Package page]

    If you select these options, the next step is selecting the target location of the package:

    [Screenshot ImportExportWizard2: selecting the target location of the package]

    The next page of the Import and Export Wizard displays a summary of the selected options:

    [Screenshot ImportExportWizard3: summary of the selected options]

    Clicking the Finish button creates, saves, and executes the SSIS package as configured, and the results are displayed:

    [Screenshot ImportExportWizard4: execution results]

    Cool. Now what?

    Now you have an SSIS package saved in the file system. In this case, the file is named ImportExportWizard.dtsx. It can be opened in SQL Server Data Tools by right-clicking the file and selecting “Edit”:

    [Screenshot ImportExportWizard5: right-clicking the .dtsx file and selecting Edit]

    Windows Server 2012 prompts for which program to use to perform the Edit. I select Microsoft Visual Studio Version Selector:

    [Screenshot ImportExportWizard6: choosing Microsoft Visual Studio Version Selector]

    SQL Server 2012 Data Tools uses the Visual Studio 2010 Integrated Development Environment (IDE) at the time of this writing. Note the “Run” (Play) button is disabled on the toolbar:

    [Screenshot ImportExportWizard7: the package open in Visual Studio with the Run button disabled]

    We have an SSIS package created with the Import and Export Wizard, but we cannot re-execute it.

    Bummer. But all is not lost.

    Visual Studio needs a debugger to execute packages in debug mode. When we open an SSIS (*.dtsx) file all by itself, Visual Studio doesn’t load a debugger. To have Visual Studio load the SSIS debugger, we can create a Solution containing a Project. Begin by clicking File, New, Project:

    [Screenshot ImportExportWizard8: File, New, Project]

    Name the new solution and project, and select a folder:

    [Screenshot ImportExportWizard9: naming the new solution and project]

    Once the solution and project are created, right-click the SSIS Packages virtual folder in Solution Explorer, and select “Add Existing Package”:

    [Screenshot ImportExportWizard10: Add Existing Package]

    Navigate to the location where you stored the SSIS package in the final steps of the Import and Export Wizard:

    [Screenshot ImportExportWizard11: navigating to the saved package]

    When you click the OK button, the SSIS package is copied into the project folder and added to the new project:

    [Screenshot ImportExportWizard12: the package added to the new project]

    When you open the SSIS package in the context of a Visual Studio project and solution, the Debug button is enabled on the toolbar and the SSIS package can be executed in debug mode:

    [Screenshot ImportExportWizard13: the Debug button enabled in the project context]

    You can use your knowledge of the Import and Export Wizard to learn more about SSIS package development. Happy Integrating!

    :{>

  • Long Poles and Critics

    [Image: The Old New Thing book cover]

    As a consultant, I get calls to complete projects started by someone else or extend projects completed by someone else. When I look at someone else's work, it's sometimes tempting to say, "Wow - they did that wrong." But I don't. Instead I say, "I'm not sure why they built it this way." That may sound back-handed, but I make sure it's not by asking follow-up questions. Which questions? My favorite is, "What was the problem they were trying to solve?" It’s entirely possible my predecessor delivered precisely what he was asked to deliver. Plus, software projects evolve (especially elegant software projects). If software solves a problem, it's common for "new" problems to come into focus.

    We speak about this in terms like "long pole," the subject of this article by Raymond Chen, gifted thinker and author of The Old New Thing (which, I am embarrassed to admit, just found its way onto my Kindle). If I'm taking down a tent, I may decide to take down the tallest (longest) pole first. That makes the tent noticeably shorter and provides clear evidence to onlookers that I'm doing my job (aka “highly visible”). But then, *another* pole becomes the long pole. And the cycle repeats.

    Some things to keep in mind before criticizing:

    1. Delivering software is a collaboration between the service provider and the customer. It's never 50/50%. There's always imbalance even if it's as little as 49/51% – and this applies to success as well as failure. If you criticize – especially early-on – you may be speaking to the person on the 51% side of a failure. You may be unwittingly feeding the beast with all new criticisms, which leads to my next consideration...
    2. If I criticize the work of others, I am priming my customer to criticize the next bit of work she sees. Who's on deck? Me and my work.
    3. “But what if the person before me did something truly horrible, Andy?” That’s a fair question and I have a question for you, “Are you aware of 100% of the factors that went into the decisions made by your predecessor?” I’m certain the answer is “no.” Are you aware of 50%? Again, no. At best, you’re speaking to one side of an incomplete project. You will most likely have no opportunity to speak with your predecessor and the person with whom you are speaking is not going to tell you all of their side. You’re not going to get even half of the story! Now, your predecessor could have delivered something dangerous, illegal, insecure, of poor quality, not up to modern standards and best practices, or merely a solution of which you do not approve. They could well and truly be 100% to blame. Your customer may indicate that they want you to disparage the work of your predecessor. I advise you to resist the temptation to do so. Again, my advice is to fall back to “I don’t understand why they built it this way,” or (perhaps), “Based on what I’ve heard today, I would have designed this differently,” or, “I would have delivered the solution so that ______.” Why? Because you don’t know the other side of the story.

    Maybe I’m wrong about this. Maybe you’ve built a reputation as an expert by disparaging the work of others, thinking that you will get all of the work and everyone will think you’re a rock star. Maybe. Or maybe John Nash and I are right about coopetition, and there’s more work out there than you can handle alone – and you have, unwittingly, introduced errors into your own deliverables and primed your customers to criticize the mistakes of consultants. (Even you.)

    Time will tell.

    Peace.

    :{>

    Learn More:
    SSIS Design Patterns and Biml: A Day of Intelligent Data Integration – Boston SQL Saturday precon, 24 Feb 2017
    Save Time and Improve SSIS Quality with Biml
    An Example of Data Integration Lifecycle Management with SSIS, Part 4
    The Recordings for SSIS Academy: Using the SSIS Catalog are Available
    SSIS Catalog Compare v2.0 Launch Event 7 Mar 2017!

    Related Training:
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – April 2017, Chicago 
    IESSIS2: Immersion Event on Advanced Integration Services – Oct 2017, Chicago
    From Zero to Biml - 19-22 Jun 2017, London

  • The Last SSIS Book You Read

    [Image: SSIS Design Patterns book cover]

    Every now and then I get an email or meet someone at an event and they say something like, “I learned SSIS from your book!” I’m not going to lie to you, that makes me feel pretty good. Since I’ve written most of my books as part of a team, I usually reply, “It was a team effort and our team was a bunch of really smart people… plus me!”

    These folks tell me they read other books about SSIS (SQL Server Integration Services) but they didn’t really grok SSIS until they read the book authored by the team and me. I suppose that could mean the team and I are awesome. I believe that’s mostly true, but I am definitely biased…

    Here’s what I believe is really happening, though. I think these folks – and many others I’ve never met – learned more than they realized from those other books. I think our book was the last or latest SSIS book they read. I think the other books exposed these readers to complex data integration concepts, perhaps for the very first time. I still remember learning data integration and it was painful for me – and I believe I have a knack for it!

    I think our book is merely the book they read after the others. I’m willing to bet other folks have read our books first, then read books by other authors, and told those authors the same thing. I’m cool with that. As an author, I’m happy you’re reading books about the topic.

    :{>

    Learn More:
    SSIS Design Patterns and Biml: A Day of Intelligent Data Integration – Boston SQL Saturday precon, 24 Feb 2017
    Save Time and Improve SSIS Quality with Biml
    An Example of Data Integration Lifecycle Management with SSIS, Part 4
    The Recordings for SSIS Academy: Using the SSIS Catalog are Available
    SSIS Catalog Compare v2.0 Launch Event 7 Mar 2017!

    Related Training:
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – April 2017, Chicago 
    IESSIS2: Immersion Event on Advanced Integration Services – Oct 2017, Chicago
    From Zero to Biml - 19-22 Jun 2017, London

  • Intelligent Data Integration, SSIS Design Patterns, and Biml

    On Thursday, 15 Jun 2017, I have the privilege and honor of delivering a full-day precon titled SSIS Design Patterns and Biml: A Day of Intelligent Data Integration as part of SQL Saturday Dublin. If you’re interested in attending, you can learn more here.

    “Isn’t This The Same Presentation You Delivered Before, Andy?”

    Yes and no. It has the same title but…

    I’ve focused on Biml presentations for the past two years. Over the past eighteen months I’ve built the DILM Suite. These facts intersect: my goal is to facilitate DevOps and Continuous Integration (CI) with SQL Server Integration Services (SSIS), and Biml plays an important role – namely, automated SSIS code generation. The DILM Suite development work has impacted my webinars and presentations – especially this precon. I delivered SSIS Design Patterns and Biml: A Day of Intelligent Data Integration once before, over a year ago in Atlanta. Since then I’ve delivered modules of that presentation in Enterprise Data & Analytics webinars. With each delivery the DILM Suite development work has informed and inspired changes to the content of the modules; the content has evolved, and the 15 Jun delivery will be different.

    This evolution-of-content has happened to many of my Biml presentations. In some cases the updates are such that today’s version of the presentation is a radical departure from the first delivery. Why? I’m constantly learning. Writing the DILM Suite has intensified my learning. As I’ve shepherded this vision and watched it come to fruition, I’ve discovered new possibilities and more use cases.

    “Mini-Cycles”

    I catch a glimpse of what’s possible and develop until it’s reality. As I develop, the glimpse becomes more defined and I add and refine features in response. This “mini-cycle” continues until I reach a good stopping point with a solution, product, or utility. By then I’ve caught a glimpse of a solution to another problem and begin developing a different solution… and the mini-cycle repeats for this other solution, product, or utility.

    With DILM Suite, I catch a glimpse of an Euler diagram (I think visually, in graphs) showing how two or more of the solutions, products, and utilities work together to facilitate more complex DevOps and SSIS CI scenarios. This started in early 2016, around the time I began releasing a handful of free utilities. There will be more free utilities (one within the next 30 days) but the list of free DILM Suite stuff at the time of this writing is:

    The blog post titled An Example of Data Integration Lifecycle Management with SSIS, Part 4 provides a glimpse of how one might use four of these free tools together (everything except the Biml Express Metadata Framework – which hadn’t been released at that time). Today, at the time of this writing, that glimpse is my latest “pinnacle.” The Euler diagrams in my mind, though, are already two pinnacles beyond that – and working on a 3rd. It’s likely the 15 Jun delivery of the SSIS Design Patterns and Biml: A Day of Intelligent Data Integration precon will contain material beyond these five free tools.

    The delivery after 15 Jun will likely contain still more material. I’m continuously integrating my Continuous Integration and DevOps-for-SSIS thoughts, and then building tools and designing best practices and patterns to support the latest version of my vision.

    I don’t expect to stop.

    Ever.

    “Is the SSIS Design Patterns and Biml: A Day of Intelligent Data Integration Precon Just One Big Commercial for the DILM Suite, Andy?”

    Goodness no.

    In the first part I’m going to share everything I know about using what’s-in-the-box to deliver enterprise-class data integration with SSIS – some of which Kent Bradshaw and I covered in the 3-part webinar series titled SSIS Academy: Using the SSIS Catalog (we stayed “in the box” for these three webinars). In the second part I’ll point out some gaps in the OOTB solutions and demonstrate some ways to close them. Examples of some (not all) solutions are free DILM Suite tools.

    Conclusion

    I hope to see you at SQL Saturday Dublin 17 Jun! If you’re interested in learning more about DevOps and Data Integration Lifecycle Management, I also hope to see you at the SSIS Design Patterns and Biml: A Day of Intelligent Data Integration precon.

    You might enjoy engaging Enterprise Data & Analytics consultants because we like helping teams learn more about DevOps and CI for SSIS.

    :{>

    Learn More:
    An Example of Data Integration Lifecycle Management with SSIS, Part 4
    The Recordings for SSIS Academy: Using the SSIS Catalog are Available
    Save Time and Improve SSIS Quality with Biml
    SSIS Catalog Compare v2.0 Launch Event 7 Mar 2017!

    Related Training:
    From Zero to Biml - 19-22 Jun 2017, London
    IESSIS1: Immersion Event on Learning SQL Server Integration Services – April 2017, Chicago 
    IESSIS2: Immersion Event on Advanced Integration Services – Oct 2017, Chicago
