Update 16-10-2010 : I’ve updated the code because we found a bug regarding the precision when dealing with small amounts.
(This code has been used in a production system for over a year now, but do use any code you find on the web at your own risk !)
Have you ever got files to import from a Mainframe ? Chances are that the file is in some ebcdic encoding, and could contain something called Packed Decimals.
(also known as COMP-3 in Cobol)
You can find an explanation of what a packed decimal is here.
Basically it’s a compression method for decimals.
Let’s take for example the string “esdfsm”, this takes up 6 characters in a text file.
But when you convert this to hex you get “65 73 64 66 73 6D”.
Now we have 11 digits, the last character in the hex representation tells you if it’s positive or negative. An F means positive and the D stands for negative.
There is no place to store the decimal separator, but if you know what the original type was, you can just insert it in the right place.
If this were a decimal(11,2) you would get –657.364.667,36
Note: The ebcdic encoding often just shows weird unreadable characters, which need to be encoded from ebcdic to unicode first before you can read them or insert them in SQL Server. (SQL Server does have some ebcdic encoding support, but not for ebcdic(500) which is a common ebcdic International encoding.)
You can’t encode the whole file though because the packed decimals will lose their meaning, these need to be calculated from their binary form.
Reading such a file with SSIS can be done using a Script Component.
Just follow these steps :
Note: This example is based on a fixed length file, you could have a CSV file and would need to change the code to your needs….
Also, the packed decimal will be transformed to a right aligned string.
(I needed that for my situation, so as with the csv example, you need to change the code if you need a slightly different representation.)
1. Add a script component to a data flow and choose “Source” when asked for the Script Component Type
2. Add the appropriate output name and column names and types. I’m exporting everything as a Unicode string in this example. The packed decimal is the SaleAmount field which is 13 characters wide. (6 bytes becomes 12 hex, containing 11 digits and a sign. +1 for the decimal separator).
3. Go to the script tab, press Edit Script… Add the following code. (Replacing the existing code)
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
using System.Text;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
public override void PreExecute()
{
base.PreExecute();
/*
Add your code here for preprocessing or remove if not needed
*/
}
public override void PostExecute()
{
base.PostExecute();
/*
Add your code here for postprocessing or remove if not needed
You can set read/write variables here, for example:
Variables.MyIntVar = 100
*/
}
public override void CreateNewOutputRows()
{
Encoding ebcdic = Encoding.GetEncoding(500);
Encoding unicode = Encoding.Unicode;
using (BinaryReader b = new BinaryReader(File.Open("c:\\ebcdicfile.dat", FileMode.Open)))
{
byte[] line;
int pos = 0;
int length = (int)b.BaseStream.Length;
while (pos < length)
{
line = b.ReadBytes(45); // Make sure to specify the exact length of 1 line in the file here !
pos += 45;
MyDataBuffer.AddRow();
MyDataBuffer.CustNo = unicode.GetString(Encoding.Convert(ebcdic, unicode, line, 0, 4));
MyDataBuffer.CustName = unicode.GetString(Encoding.Convert(ebcdic, unicode, line, 4, 35));
MyDataBuffer.SaleAmount = UnPack(line, 39, 6, 2);
}
}
}
public string UnPack(byte[] Data, int Start, int Length, int Precision)
{
string unpacked = "";
string signed = " ";
int pos = 0;
while (pos < Length)
{
unpacked += Data[Start + pos].ToString("x2");
pos += 1;
}
if (unpacked.EndsWith("d"))
signed = "-";
unpacked = unpacked.Remove(unpacked.Length - 1);
unpacked = unpacked.TrimStart('0');
if (unpacked == "")
unpacked = "0";
if (Precision > 0)
if (unpacked.Length > Precision)
unpacked = unpacked.Insert(unpacked.Length - Precision, ".");
else
unpacked = "0." + unpacked.PadLeft(Precision,'0');
unpacked = signed + unpacked;
if (Precision > 0)
unpacked = unpacked.PadRight((unpacked.Length + Precision) - (unpacked.Length - (unpacked.IndexOf(".") + 1)), '0');
if (Precision == 0)
unpacked = unpacked.PadLeft(Length * 2);
else
unpacked = unpacked.PadLeft(1 + Length * 2);
return unpacked;
}
}
4. Change the appropriate code in the CreateNewOutputRows() function to reflect your file specification.
Change : File.Open(“c:\\ebcdifile.dat” to reflect the correct file location
Change : b.ReadBytes(45); and pos += 45; to reflect the length of a line in your file.
Change : MyDataBuffer to reflect the output name you have chosen (SSIS adds “Buffer” to the end)
Change: CustNo, CustName & SaleAmount to your own fieldnames.
Change: (ebcdic, unicode, line, 0, 4) etc. where the last values represent the starting position in the file and the length
Change: UnPack(line, 39, 6,2) etc. where the 3 values represent the starting position in the file, the length in the file and the desired number of decimals.
5. Save, close the script window and press OK.
Now all that is left to do is read the data into the database with an OLE DB Destination.

Leave a comment