Tuesday 27 December 2011

SSIS Control Flow and Data Flow

I think that in SSIS we need a solid understanding of Control Flow and Data Flow before moving on to any complicated scenario. I have tried to collect some facts about them here. They may help you, but I always prefer MSDN for better guidance.

Control flow deals with the orderly processing of tasks, which are individual, isolated units of work that perform a specific action and end with a finite outcome (one that can be evaluated as Success, Failure, or Completion). While their sequence can be customized by linking them into arbitrary arrangements with precedence constraints, and by grouping them together or repeating their execution in a loop with the help of containers, a subsequent task does not start until its predecessor has completed.
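To make the idea of precedence constraints more concrete, here is a minimal Python sketch. It is not SSIS code and does not use any SSIS API; it only imitates the behaviour described above. The names `run_package`, `demo_package`, and the task names are made up for illustration, and the sketch assumes each task has only one predecessor.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"


class Constraint(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    COMPLETION = "Completion"  # satisfied by either outcome


def constraint_met(constraint, outcome):
    """Return True if the predecessor's outcome satisfies the constraint."""
    if constraint is Constraint.COMPLETION:
        return True
    return constraint.value == outcome.value


def run_package(tasks, constraints, start):
    """Run tasks one at a time, following the constraints that are satisfied.

    tasks:       dict of task name -> callable returning True on success
    constraints: list of (predecessor, Constraint, successor) tuples
    start:       name of the first task
    Returns the task names in the order they were executed.
    """
    executed = []
    pending = [start]
    while pending:
        name = pending.pop(0)
        try:
            ok = tasks[name]()
        except Exception:
            ok = False  # an exception counts as Failure in this sketch
        outcome = Outcome.SUCCESS if ok else Outcome.FAILURE
        executed.append(name)
        # A successor is queued only after its predecessor has finished.
        for pred, constraint, succ in constraints:
            if pred == name and constraint_met(constraint, outcome):
                pending.append(succ)
    return executed


def demo_package(load_succeeds):
    """Hypothetical package: truncate, load, then branch on the load result."""
    tasks = {
        "Truncate staging": lambda: True,
        "Load staging": lambda: load_succeeds,
        "Archive file": lambda: True,
        "Send failure mail": lambda: True,
        "Write audit row": lambda: True,
    }
    constraints = [
        ("Truncate staging", Constraint.SUCCESS, "Load staging"),
        ("Load staging", Constraint.SUCCESS, "Archive file"),
        ("Load staging", Constraint.FAILURE, "Send failure mail"),
        ("Load staging", Constraint.COMPLETION, "Write audit row"),
    ]
    return run_package(tasks, constraints, "Truncate staging")
```

Calling `demo_package(True)` follows the Success path and `demo_package(False)` follows the Failure path, while the task attached with a Completion constraint runs in both cases.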

Data flow, on the other hand, handles its processing responsibilities by employing the pipeline paradigm, carrying data record by record (or, to be more accurate, memory buffer by memory buffer) from its source to a destination and modifying it in transit by applying transformations. (There are exceptions to this rule, since some transformations, such as Sort or Aggregate, need to see the entire data set before handing it over to their downstream counterparts.) Note that the sequencing of control flow tasks does not imply that they cannot be executed in parallel, but rather that if they are, their actions are not coordinated (unlike the processing of data flow components that are part of the same data stream). Another distinction between the two is that control flow has no mechanism for directly transferring data between individual tasks. Data flow, for its part, lacks the nesting capabilities provided by containers.
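The pipeline idea can be sketched in a few lines of Python using generators. Again, this is only an illustration and not how the SSIS engine is implemented. The function names, the column names (`qty`, `price`, `total`), and the tiny buffer size are all assumptions chosen for the example. The derived-column step handles one buffer at a time, while the sort step has to collect every row before it can pass anything downstream.

```python
from itertools import islice


def source(rows, buffer_size):
    """Yield the rows in fixed-size buffers (plain lists)."""
    it = iter(rows)
    while True:
        buf = list(islice(it, buffer_size))
        if not buf:
            return
        yield buf


def derived_column(buffers):
    """Non-blocking step: transforms each buffer as soon as it arrives."""
    for buf in buffers:
        yield [dict(row, total=row["qty"] * row["price"]) for row in buf]


def sort_rows(buffers, key, buffer_size):
    """Blocking step: needs the whole data set before emitting any buffer."""
    all_rows = [row for buf in buffers for row in buf]
    all_rows.sort(key=lambda row: row[key])
    yield from source(all_rows, buffer_size)


def destination(buffers):
    """Collect every buffer into a list that stands in for a target table."""
    loaded = []
    for buf in buffers:
        loaded.extend(buf)
    return loaded


def run_data_flow(rows, buffer_size=2):
    """Chain source -> derived column -> sort -> destination."""
    pipeline = source(rows, buffer_size)
    pipeline = derived_column(pipeline)
    pipeline = sort_rows(pipeline, "total", buffer_size)
    return destination(pipeline)
```

For example, `run_data_flow([{"qty": 2, "price": 5.0}, {"qty": 1, "price": 3.0}])` returns the two rows with a `total` column added, ordered by that total.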

These two SQL Server Integration Services features are implemented as two tabs (bearing their respective names) of the Designer interface in Business Intelligence Development Studio. The control flow portion of a package is built by populating the area exposed by the first of these tabs, typically by dragging the tasks and containers that deliver the desired functionality from the Toolbox. The same method applies when adding data sources, destinations, and transformations to the area exposed by the Data Flow tab (the Toolbox adjusts its content depending on the context).

Summary at a glance

Control Flow:

  • Process is the key: precedence constraints control the package flow based on task completion, success, or failure
  • When two tasks are linked, task 1 needs to complete before task 2 begins
  • Smallest unit of the control flow is a task
  • Control flow does not move data from task to task
  • Tasks run in series if they are connected with precedence constraints, or in parallel if they are not
  • Package control flow is made up of containers and tasks connected with precedence constraints to control package flow

Data Flow:

  • Streaming
  • Unlike control flow, multiple components can process data at the same time
  • Smallest unit of the data flow is a component
  • Data flows move data, but they are also tasks in the control flow; as such, their success or failure affects how your control flow operates
  • Data is moved and manipulated through transformations
  • Data is passed between each component in the data flow
  • Data flow is made up of source(s), transformations, and destinations.

To be continued…

Posted By: MR. JOYDEEP DAS

 

2 comments:

  1. Excellent Post.
    Waiting for more...........

  2. Good One.
    Please give some examples, which would make it easier to visualize.

    Expecting more......
