Archive for October, 2009

Reading Packed Decimals with SSIS

Have you ever got files to import from a Mainframe ? Chances are that the file is in some ebcdic encoding, and could contain something called Packed Decimals.
(also known as COMP-3 in Cobol)

You can find an explanation of what a packed decimal is here.
Basically it’s a compression method for decimals.

Let’s take for example the string “esdfsm”, this takes up 6 characters in a text file.
But when you convert this to hex you get “65 73 64 66 73 6D”.
Now we have 11 digits, the last character in the hex representation tells you if it’s positive or negative. An F means positive and the D stands for negative.
There is no place to store the decimal separator, but if you know what the original type was, you can just insert it in the right place.

If this were a decimal(11,2) you would get –657.364.667,36

Note: The ebcdic encoding often just shows weird unreadable characters, which need to be encoded from ebcdic to unicode first before you can read them or insert them in SQL Server. (SQL Server does have some ebcdic encoding support, but not for ebcdic(500) which is a common ebcdic International encoding.)
You can’t encode the whole file though because the packed decimals will lose their meaning, these need to be calculated from their binary form.

Reading such a file with SSIS can be done using a Script Component.
Just follow these steps :

Note: This example is based on a fixed length file, you could have a CSV file and would need to change the code to your needs….
Also, the packed decimal will be transformed to a right aligned string.
(I needed that for my situation, so as with the csv example, you need to change the code if you need a slightly different representation.)

1. Add a script component to a data flow and choose “Source” when asked for the Script Component Type

image

2. Add the appropriate output name and column names and types. I’m exporting everything as a Unicode string in this example. The packed decimal is the SaleAmount field which is 13 characters wide. (6 bytes becomes 12 hex, containing 11 digits and a sign. +1 for the decimal separator).

image

3. Go to the script tab, press Edit Script… Add the following code. (Replacing the existing code)

using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using System.IO;
using System.Text;

[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{

    public override void PreExecute()
    {
        base.PreExecute();
        /*
          Add your code here for preprocessing or remove if not needed
        */
    }

    public override void PostExecute()
    {
        base.PostExecute();
        /*
          Add your code here for postprocessing or remove if not needed
          You can set read/write variables here, for example:
          Variables.MyIntVar = 100
        */
    }

    public override void CreateNewOutputRows()
    {
        Encoding ebcdic = Encoding.GetEncoding(500);
        Encoding unicode = Encoding.Unicode;

        using (BinaryReader b = new BinaryReader(File.Open("c:\\ebcdicfile.dat", FileMode.Open)))
        {
            byte[] line;
            int pos = 0;
            int length = (int)b.BaseStream.Length;

            while (pos < length)
            {
                line = b.ReadBytes(45); // Make sure to specify the exact length of 1 line in the file here !
                pos += 45;

                MyDataBuffer.AddRow();

                MyDataBuffer.CustNo = unicode.GetString(Encoding.Convert(ebcdic, unicode, line, 0, 4));
                MyDataBuffer.CustName = unicode.GetString(Encoding.Convert(ebcdic, unicode, line, 4, 35));
                MyDataBuffer.SaleAmount = UnPack(line, 39, 6, 2);
            }
        }
    }

    public string UnPack(byte[] Data, int Start, int Length, int Precision)
    {
        string unpacked = "";
        string signed = " ";
        int pos = 0;

        while (pos < Length)
        {
            unpacked += Data[Start + pos].ToString("x2");
            pos += 1;
        }

        if (unpacked.EndsWith("d"))
            signed = "-";

        unpacked = unpacked.Remove(unpacked.Length - 1);
        unpacked = unpacked.TrimStart('0');

        if (unpacked == "")
            unpacked = "0";

        if (Precision > 0)
            if (unpacked.Length > Precision)
                unpacked = unpacked.Insert(unpacked.Length - Precision, ",");
            else
                unpacked = "0," + unpacked;

        unpacked = signed + unpacked;

        if (Precision > 0)
            unpacked = unpacked.PadRight((unpacked.Length + Precision) - (unpacked.Length - (unpacked.IndexOf(",") + 1)), '0');

        if (Precision == 0)
            unpacked = unpacked.PadLeft(Length * 2);
        else
            unpacked = unpacked.PadLeft(1 + Length * 2);

        return unpacked;
    }
}

 

4. Change the appropriate code in the CreateNewOutputRows() function to reflect your file specification.

Change : File.Open(“c:\\ebcdifile.dat” to reflect the correct file location
Change : b.ReadBytes(45); and pos += 45; to reflect the length of a line in your file.
Change : MyDataBuffer to reflect the output name you have chosen (SSIS adds “Buffer” to the end)
Change: CustNo, CustName & SaleAmount to your own fieldnames.
Change: (ebcdic, unicode, line, 0, 4) etc. where the last values represent the starting position in the file and the length
Change: UnPack(line, 39, 6,2) etc. where the 3 values represent the starting position in the file, the length in the file and the desired number of decimals.

5. Save, close the script window and press OK.
Now all that is left to do is read the data into the database with an OLE DB Destination.

image

SQLCAT team at Dutch PASS Chapter – Swimsuit Edition

Our Dutch chapter was honoured by a visit from the SQLCAT team. Lubor Kollar and Thomas Kejser from the SQLCAT team even brought Sameet Agarwal and Amit Shukla along who are from the Core Engine team for SQL Server.

IMG_6927

IMG_6928

 

 

 

 

 

 

 

The evening was hosted by Henk van der Valk who works for Unisys. Our chapter meets every month at various locations, we depend on the generosity of company’s who sponsor our user group meetings. Unisys has become a true “friend of PASS” in the past years having sponsored a number of meetings already, largely due to fellow member Henk who runs the Performance Test Centre of Unisys in the Netherlands.

The setup of this evening was pretty simple but sure did the trick. Lubor introduced Thomas, Amit, Sameet, himself and explained what the SQLCAT team is and does. After that Sameet explained what his Core Engine team is and does. The rest was up to us, we were free to ask questions after that. Following a silence of about 20 seconds was a 2 hour Q&A session. The questions were quite literally about almost everything. Complaints, feature requests or scalability / maintenance tips and tricks. You could ask pretty much anything and you would either get an answer or notice that Lubor was making lot’s of notes.

The cool part is that you can do this too. The SQLCAT team members can be found at many conferences like SQLBits, the European PASS Conference (more on the 2010 edition soon) and of course the PASS Summit. And let’s not forget to mention that they share a wealth of knowledge through white papers etc. via their website www.sqlcat.com !

Lubor showed proof at the end of the evening that you can literaly run into CAT team members anywhere by showing a picture of Kenneth Willhelmson, himself and yours truly from the SQL Server Open World conference in Lalandia, Denmark.

DSCN0407

Or maybe showing this picture on the screen in front of 50 chapter members was a trick to get me to be quiet. Like Sameet calling me a trouble maker was not a strong enough hint. Well, they did say that you could say anything, even complain about something or request features :-)

It was all in good fun of course and I would like to thank Lubor and Thomas from the CAT team, Sameet and Amit from the Core Engine Team, Henk van der Valk from Unisys and of course everyone who showed up from our PASS Chapter for this excellent Q&A session !