Building a FixedWidthFormatter using Generics and Expressions

Among the many integrations I work on at my job, one involved exporting cumulative data on a daily schedule. The output was defined as a fixed-width export. Most of the time, export data ends up in a format such as CSV, XML, or JSON, but fixed-width was different.

Here is an example of a fixed-width export I was working with. Given this representation of the data:

public class DataExport  
{  
    public string FirstName { get; set; }  
    public string LastName { get; set; }  
    public string Gender { get; set; }  
}  

we could expect output that looks like this:

dave      jones     m
joe       schmoe    m
betty     ann       f

Simple. So we basically need to tell the FixedWidthFormatter two things:

  • how many characters to reserve for each property
  • which value to write into each property’s slot for the class being mapped.

At first, it might be tempting to just jump in and start coding something like this:

public class FixedWidthFormatter  
{  
    private readonly Dictionary<string, FixedWidth> propertyPositions = new Dictionary<string, FixedWidth>();

    public void SetWidthFor(string fieldName, FixedWidth fixedWidth)  
    {  
        propertyPositions.Add(fieldName, fixedWidth);  
    }

    public string Format(IEnumerable<DataExport> collectionToFormat)  
    {  
        // Start doing the formatting work based on the collection passed in
        // and the fixed widths for each property in DataExport.
        throw new NotImplementedException();
    }  
}  

Here is the supporting FixedWidth class (note this class is immutable):

public class FixedWidth  
{  
    public FixedWidth(int from, int to)  
    {  
        From = from;  
        To = to;  
    }

    public int From { get; private set; }  
    public int To { get; private set; }  
}  
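To make the From/To pair a little more concrete, here is a minimal sketch of how a column's size falls out of it, assuming both positions are inclusive (which matches the arithmetic used later in the post). The ColumnLength helper is a hypothetical name used only for illustration; the FixedWidth class is repeated so the sketch compiles on its own.

```csharp
using System;

// Copy of the FixedWidth class from the post, repeated so this sketch is self-contained.
public class FixedWidth
{
    public FixedWidth(int from, int to)
    {
        From = from;
        To = to;
    }

    public int From { get; private set; }
    public int To { get; private set; }
}

// Hypothetical helper, not part of the original code.
public static class ColumnLength
{
    // Number of characters the column occupies, treating From and To as inclusive.
    public static int Of(FixedWidth fixedWidth)
    {
        if (fixedWidth.To < fixedWidth.From)
            throw new ArgumentException("To must not be less than From.");

        return fixedWidth.To - fixedWidth.From + 1;
    }
}
```

With this helper, ColumnLength.Of(new FixedWidth(1, 10)) is 10 and ColumnLength.Of(new FixedWidth(21, 25)) is 5.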

Consuming the class’s API would look like this:

var dataExport = new List<DataExport>
{  
    new DataExport { FirstName = "dave", LastName = "jones", Gender = "m"},  
    new DataExport { FirstName = "joe", LastName = "schmoe", Gender = "m"},  
    new DataExport { FirstName = "betty", LastName = "ann", Gender = "f"}  
};

var fixedWidthFormatter = new FixedWidthFormatter();
fixedWidthFormatter.SetWidthFor("FirstName", new FixedWidth(1, 10));  
fixedWidthFormatter.SetWidthFor("LastName", new FixedWidth(11, 20));  
fixedWidthFormatter.SetWidthFor("Gender", new FixedWidth(21, 25));

var results = fixedWidthFormatter.Format(dataExport);  

We’re creating a new List, populating the list, then setting the fixed widths on the properties using the property name and the FixedWidth class. When we’re done setting those widths, we give the FixedWidthFormatter the List and it formats the data in the list corresponding to the widths set for each property.

There are a couple of things wrong with this approach:

  • We’re using strings for the names of the DataExport properties we’re assigning a fixed width to. Because we’re relying on strings, if any property name on DataExport changes, we won’t know we have a problem until run-time. This is a HUGE type-safety issue.
  • We’re passing an IEnumerable<DataExport> to the Format method. This limits the class to fixed-width formatting of DataExport only. Since fixed-width formatting is useful for any DTO-style data definition, this class should be able to format any type that is passed to it.
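To see why the string-based version is fragile, here is a small sketch of the only kind of check available to it: a reflection lookup that runs at run-time. MagicStringDemo and FindUnknownNames are hypothetical names made up for this illustration, and DataExport is repeated so the sketch stands alone.

```csharp
using System;
using System.Collections.Generic;

// Copy of the DataExport class from the post, repeated so this sketch is self-contained.
public class DataExport
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Gender { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class MagicStringDemo
{
    // Returns the configured names that do not match any public property on T.
    // The compiler cannot do this for us: a string compiles whatever it contains,
    // so a mismatch only shows up when this reflection check actually runs.
    public static List<string> FindUnknownNames<T>(IEnumerable<string> configuredNames)
    {
        var unknown = new List<string>();
        foreach (var name in configuredNames)
        {
            if (typeof(T).GetProperty(name) == null)
                unknown.Add(name);
        }
        return unknown;
    }
}
```

Calling MagicStringDemo.FindUnknownNames<DataExport>(new[] { "FirstName", "Surname" }) reports "Surname" as unknown, but only once the code is running. A renamed property would slip through compilation the same way.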

Time to refactor.


Generics and Expressions

The first step we want to take a look at is getting rid of those “magic strings” being passed to the SetWidthFor method. We can do this with a one-two punch of generics and expressions.

First, let’s change the class definition to look like this:

public class FixedWidthFormatter<T> where T : class  
{…}  

With the generic parameter T, the FixedWidthFormatter can format any class we specify when we instantiate it. Based on this one change, our API now looks like this:

var fixedWidthFormatter = new FixedWidthFormatter<DataExport>();  

We’re telling the FixedWidthFormatter that we want to create a new formatter for the DataExport type.

Now that we have the type of the FixedWidthFormatter defined as a formatter for DataExport, we can use an expression to add type-safety to the SetWidthFor method. Here is the code:

public void SetWidthFor<TProperty>(Expression<Func<T, TProperty>> expression, FixedWidth fixedWidth)  
{  
    var memberExpression = expression.Body as MemberExpression;  
    propertyPositions.Add(memberExpression.Member.Name, fixedWidth);  
}  

Not only are we using T in our expression, but we’ve added another generic type, TProperty, which the compiler infers from the lambda passed to the method. By making “Expression<Func<T, TProperty>> expression” the first parameter of the SetWidthFor method, we now enforce type-safety based on the type T (DataExport): the caller passes a lambda that selects the property we want to assign the fixed width to.
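If expression trees are new to you, this small sketch shows the name extraction on its own. PropertyName.Of is a hypothetical helper written for this illustration (it is not part of the formatter), and the guard clause is my own addition: the body of the lambda is only a MemberExpression when the lambda is a plain member access.

```csharp
using System;
using System.Linq.Expressions;

// Copy of the DataExport class from the post, repeated so this sketch is self-contained.
public class DataExport
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string Gender { get; set; }
}

// Hypothetical helper, not part of the original code.
public static class PropertyName
{
    // Returns the name of the member selected by a lambda such as x => x.FirstName.
    public static string Of<T, TProperty>(Expression<Func<T, TProperty>> expression)
    {
        // For x => x.FirstName the body is a MemberExpression whose Member is the
        // FirstName property. For a method call such as x => x.FirstName.ToUpper()
        // the body is a different node type, so the cast yields null.
        var memberExpression = expression.Body as MemberExpression;
        if (memberExpression == null)
            throw new ArgumentException(
                "Expected a simple property access such as x => x.FirstName.", "expression");

        return memberExpression.Member.Name;
    }
}
```

PropertyName.Of((DataExport x) => x.FirstName) returns "FirstName". If FirstName is ever renamed or removed, that lambda stops compiling, which is exactly the safety net the string version was missing.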

Here is an example of the API with the new method signature:

var fixedWidthFormatter = new FixedWidthFormatter<DataExport>();  
fixedWidthFormatter.SetWidthFor(x => x.FirstName, new FixedWidth(1, 10));  

Much nicer.

Note the lambda (x => x.FirstName) being passed as the first argument. That’s our expression at work for us. Now we can specify the property with type-safety, enforced by the T we supplied when instantiating the FixedWidthFormatter and by the expression that selects a property of T.

We have one last thing to discuss, and that’s the Format method taking an IEnumerable<DataExport>. Earlier we said that accepting one specific type of collection limits the FixedWidthFormatter to just the DataExport class.

We’ve already solved this problem when we specified type T for the FixedWidthFormatter class:

public string Format(IEnumerable<T> collectionToFormat)  
{…}  

The collection we now give to the Format method is being constrained by the type T we use when instantiating the FixedWidthFormatter.

Here is the finished code for FixedWidthFormatter:

using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Text;

public class FixedWidthFormatter<T> where T : class
{
    // Maps each property name to the column range it occupies in the output.
    private readonly Dictionary<string, FixedWidth> propertyPositions =
        new Dictionary<string, FixedWidth>();

    public void SetWidthFor<TProperty>(Expression<Func<T, TProperty>> expression,
        FixedWidth fixedWidth)
    {
        // Only a simple property access (x => x.FirstName) yields a MemberExpression.
        var memberExpression = expression.Body as MemberExpression;
        if (memberExpression == null)
            throw new ArgumentException("Expression must be a property access.", "expression");
        propertyPositions.Add(memberExpression.Member.Name, fixedWidth);
    }

    public string Format(IEnumerable<T> collectionToFormat)
    {
        var stringBuilder = new StringBuilder();
        foreach (var data in collectionToFormat)
        {
            foreach (var propertyInfo in data.GetType().GetProperties())
            {
                // Skip properties that were never given a width.
                FixedWidth fixedWidth;
                if (!propertyPositions.TryGetValue(propertyInfo.Name, out fixedWidth))
                    continue;

                var value = propertyInfo.GetValue(data);
                AppendDataToAssignedPosition(stringBuilder, fixedWidth,
                    value == null ? string.Empty : value.ToString());
            }
            stringBuilder.Append(Environment.NewLine);
        }
        return stringBuilder.ToString();
    }

    private static void AppendDataToAssignedPosition(StringBuilder stringBuilder,
        FixedWidth fixedWidth, string data)
    {
        stringBuilder.Append(data);

        // From and To are inclusive, so a column from 1 to 10 is 10 characters wide.
        // Note: Append throws if data is longer than the column (negative count).
        var availableSpaceForData = (fixedWidth.To - fixedWidth.From) + 1;
        var whiteSpaceLeftOver = availableSpaceForData - data.Length;
        stringBuilder.Append(' ', whiteSpaceLeftOver);
    }
}

We’re internally using a Dictionary to store each property name and its corresponding FixedWidth added via the SetWidthFor method. Note the code to pull the property name off of the expression:

var memberExpression = expression.Body as MemberExpression;  
propertyPositions.Add(memberExpression.Member.Name, fixedWidth);  

When we call Format, we use reflection to iterate through our collection and read the value of each property for each row in it. That property value is written to a StringBuilder. Most importantly, the “left over” white space is calculated from the length of each value and the amount of space allotted to it via its FixedWidth.
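One edge case worth spelling out: when a value is longer than its column, the “left over” count goes negative, and StringBuilder.Append(char, int) rejects a negative count with an ArgumentOutOfRangeException. Here is a hedged sketch of one possible alternative that truncates instead. ColumnWriter.AppendColumn is a hypothetical name, it takes the column width directly rather than a FixedWidth, and whether truncation is acceptable depends entirely on the export's requirements.

```csharp
using System;
using System.Text;

// Hypothetical variant of AppendDataToAssignedPosition, not part of the original code.
public static class ColumnWriter
{
    // Writes data into a column of exactly `width` characters:
    // shorter values are padded with trailing spaces, longer values are cut off.
    public static void AppendColumn(StringBuilder stringBuilder, int width, string data)
    {
        data = data ?? string.Empty;

        // Assumption: silently truncating is acceptable for this export.
        if (data.Length > width)
            data = data.Substring(0, width);

        stringBuilder.Append(data);

        // Never negative here, because data.Length <= width after truncation.
        stringBuilder.Append(' ', width - data.Length);
    }
}
```

For a width of 10, "dave" is written followed by 6 spaces (10 - 4), while a width of 3 turns "betty" into "bet" with no padding.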


In Closing

That’s it! There is a LOT of room for improvement here. The main issue is that you can give the FixedWidthFormatter a class with ANY type of property on it, including custom types, and the output for custom types, enums, etc. will be incorrect. Right now, this implementation only supports simple property types (strings and primitives). But given that this kind of cumulative data export works with very simple properties in its data definition, and that the export we’re producing is basically a text file, this is a good start.
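To show what “incorrect” can look like, here is a short sketch of what the ToString() call in Format produces for non-simple property types. The Gender enum, the Address class, and the ToStringDemo helper are all hypothetical and exist only for this illustration.

```csharp
using System;

// Hypothetical types, not part of the original code.
public enum Gender { Male, Female }

public class Address
{
    public string City { get; set; }
}

public static class ToStringDemo
{
    // Mirrors the null check and ToString() call used by Format.
    public static string Describe(object value)
    {
        return value == null ? string.Empty : value.ToString();
    }
}
```

ToStringDemo.Describe(Gender.Female) yields "Female", which is the enum member's name rather than a code like "f" that an export might expect. ToStringDemo.Describe(new Address { City = "Boston" }) yields the type's name instead of the city, because Address does not override ToString().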

You can find the full source code over at my GitHub account.

I hope you enjoyed the post.

Michael McCarthy
