In this blog post I want to explain something weird that happened in a previous SQL Server Integration Services (SSIS) project. A lot of different SSIS packages had to be developed. Often very similar code was used within the same package. So we just copied the Data Flows and changed the things that needed to be changed. After checking if everything worked we looked at the output of the data we found out that there were problems. Let me illustrate with an example.
We have the following simple dataflow called ‘DF | A’ with
1. Take some data (10 row)
2. Add a derived column ‘Result’
3. This column will be filled with the letter ‘A’ within the script component ‘SC | Fill Result Column with A’
4. And finally we will output it to a Derived Column
5. A Data Viewer shows the results
When we run this, we get the following result as expected
Next we add another Data flow to the package named ‘DF | B’. This on does the same thing as the dataflow describe above. The only difference is that the ‘Result’ column will be filled with the letter ‘B’. So we just copy the Data Flow and change the code of the script component
Now if we run the query and look at the Data Viewers of both packages we get the following results.
This is strange, we see that for the result of Data Flow ‘DF | B’ we get the value ‘A’. But we set it explicitly to the value ‘B’. Let’s find out how this is possible.
I searched the Internet but I couldn’t find any causes for this problem. Disappointed and fed up with it, I decided to create the second script component from scratch which solved the issue. But we still didn’t know what the real problem was. After searching and trying every option in the SSDT toolkit, I finally turned to the XML of the package. I investigated the XML section where the Script Component of Data Flow ‘DF | B’ is defined. There I found the following piece of code.
<property dataType="System.String" description="Specifies the name of the Microsoft Visual Studio Tools for Applications project. Project names must be unique within a package." name="VSTAProjectName" typeConverter="NOTBROWSABLE"> SC_0b5003f709304e36874a827f76309177 </property>
In this section the name of the assembly is defined. The text in attribute description I found very interesting ‘Project names must be unique within a package’. This made me think what if the Script Component in Data Flow ‘DF | A’ had the same name ‘SC_0b5003f709304e36874a827f76309177’Yep, I found out that it did. SSIS copied everything even the name of the script component solution.
At runtime SSIS will build these 2 assemblies. When the assembly of the Script Component in Data Flow ‘DF | B’ is built, it is overwritten with the assembly of the script component in Data Flow ‘DF | A’. That is why we see in both result set the value ‘A’ in the ‘Result’ Column.
Now how can we fix this problem? We could change the assembly name but then we must change it everywhere in the solution of that script component. Secondly we could rewrite the script component from scratch, but who wants to do that if you code is several pages long. After some searching I found a real easy solution to the problem:
1. Open a second package.
2. Copy/Paste one of the 2 Script Components to this new package
3. Copy/Paste it back to the original location in you first package.
4. Replace the old Script Component with the new on.
5. Open the Script Component Editor
6. Click on ‘Edit Script’
7. Close it
8. click on ‘OK’
This way SSIS will have changed the assembly names for you. This time when you run the package again you get the results you want.
Thanks for reading my post. I hope it will be the solution for some of you experiencing this problem. I have a few other pitfalls in SSIS that I want to discus in future post.