
Answer by user2023861 for Parsing data from text with repeating blocks

Your methods are too big and are doing too much. ParseDebugListResponse is responsible for finding the backend boundaries, splitting the response into backend objects, determining what each line is, and parsing each line. If you don't know about the Single-Responsibility Principle, you should check it out.

A solution needs to do a few things. Let's look first at how to split all of the input lines into blocks of Backend information:

    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Linq;

    public class MyNodeCollection : IEnumerable<Backend>
    {
        private readonly List<Backend> _backends;

        public MyNodeCollection(List<string> list)
        {
            // Cut the input into one chunk of lines per backend.
            this._backends = new List<Backend>(this.FindBoundaries(list)
                .Select(s => new Backend(
                    list
                    .Skip(s.Item1)
                    .Take(s.Item2 - s.Item1)
                    .ToList())));
        }

        // Pairs each start index with its (exclusive) end index.
        private IEnumerable<Tuple<int, int>> FindBoundaries(List<string> list)
        {
            return FindStarts(list)
                .Zip(FindEnds(list), (start, end) => Tuple.Create(start, end));
        }

        // Indexes of the lines that begin a block of Backend data.
        private IEnumerable<int> FindStarts(List<string> list)
        {
            return list
                .Select((s, i) => new { s, i })
                .Where(w => w.s.StartsWith("Backend "))
                .Select(ss => ss.i);
        }

        // Exclusive end of each block: the next start, or list.Count for the final block.
        private IEnumerable<int> FindEnds(List<string> list)
        {
            return this.FindStarts(list)
                .Skip(1)
                .Concat(new List<int> { list.Count });
        }

        public IEnumerator<Backend> GetEnumerator()
        {
            return this._backends.GetEnumerator();
        }

        IEnumerator IEnumerable.GetEnumerator()
        {
            return this._backends.GetEnumerator();
        }
    }

I've split the functionality into single-statement methods, making liberal use of LINQ extension methods.

  • In FindStarts, all I'm doing is returning the indexes of lines that start with "Backend ". That is the first line of Backend data.
  • FindEnds is slightly more complicated: it returns the index at which each block of Backend data ends (exclusive). That's either the index of the next "Backend " line or, for the final block, the end of the list of strings.
  • In FindBoundaries, I need to combine the start indexes and the end indexes into tuples of indexes. I can do this using the little-used Zip method.
  • Finally, in the constructor I take the tuples of boundary indexes and use them to split the given list into chunks of Backend data.
  • Note that I have MyNodeCollection implement IEnumerable<Backend> so that your testing code will work without modification.
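
To make the boundary logic above concrete, here is a minimal standalone sketch of the same start/end pairing. The class name BoundarySketch and the sample lines are made up for illustration and are not part of the original code; the end index is treated as exclusive, matching the Skip/Take arithmetic in the constructor.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
public static class BoundarySketch
{
    // Indexes of the lines that open a block.
    public static List<int> FindStarts(IReadOnlyList<string> lines)
    {
        return lines
            .Select((line, index) => new { line, index })
            .Where(x => x.line.StartsWith("Backend ", StringComparison.Ordinal))
            .Select(x => x.index)
            .ToList();
    }

    // Pair each start with the next start (or the end of the list)
    // to get a half-open range [start, end) per block.
    public static List<Tuple<int, int>> FindBoundaries(IReadOnlyList<string> lines)
    {
        var starts = FindStarts(lines);
        var ends = starts.Skip(1).Concat(new[] { lines.Count });
        return starts.Zip(ends, (start, end) => Tuple.Create(start, end)).ToList();
    }
}
```

For a five-line input in which lines 0 and 3 start with "Backend ", FindBoundaries yields (0, 3) and (3, 5), so Skip(start).Take(end - start) picks out exactly the lines of each block.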

I've changed your Backend class to something that more closely follows the Single-Responsibility Principle. Most of the methods are only one line long. They all have meaningful names.

    using System.Collections.Generic;
    using System.Globalization;
    using System.Linq;
    using System.Text.RegularExpressions;

    public class Backend
    {
        public string Name { get; private set; }
        public string Status { get; private set; }
        public int TotalCount { get; private set; }
        public int OkCount { get; private set; }
        public float AvgResponse { get; private set; }
        public List<bool> GoodIPv4 { get; private set; }
        public List<bool> GoodXmit { get; private set; }
        public List<bool> GoodRecv { get; private set; }
        public List<bool> ErrorRecv { get; private set; }
        public List<bool> Happy { get; private set; }

        public Backend(List<string> list)
        {
            this.GoodIPv4 = new List<bool>();
            this.GoodXmit = new List<bool>();
            this.GoodRecv = new List<bool>();
            this.ErrorRecv = new List<bool>();
            this.Happy = new List<bool>();

            foreach (var line in list)
            {
                this.ParseLine(line);
            }
        }

        // Decides what kind of line this is and hands it to the matching parser.
        private void ParseLine(string line)
        {
            if (this.IsAverageResponseTime(line))
                this.AvgResponse = this.ParseAverageResponseTime(line);
            else if (this.IsErrorRecv(line))
                // NOTE: 'X' is also the Good Xmit marker below; confirm which
                // character your output actually uses on the Error Recv line.
                this.ErrorRecv.AddRange(this.ParseGoodErrorHappy(line, 'X'));
            else if (this.IsGoodIPv4Count(line))
                this.GoodIPv4.AddRange(this.ParseGoodErrorHappy(line, '4'));
            else if (this.IsGoodRecv(line))
                this.GoodRecv.AddRange(this.ParseGoodErrorHappy(line, 'R'));
            else if (this.IsGoodXmit(line))
                this.GoodXmit.AddRange(this.ParseGoodErrorHappy(line, 'X'));
            else if (this.IsHappy(line))
                this.Happy.AddRange(this.ParseGoodErrorHappy(line, 'H'));
            else if (this.IsName(line))
                this.Name = this.ParseName(line);
        }

        private string ParseName(string arg)
        {
            return Regex.Replace(arg, @"Backend (\S*) is .*", "$1");
        }

        private bool IsName(string arg)
        {
            return arg.StartsWith("Backend ");
        }

        private float ParseAverageResponseTime(string arg)
        {
            string str = Regex.Replace(arg,
                @"Average responsetime of good probes: (.*)", "$1");
            // Parse with the invariant culture so "0.5" is read the same on every machine.
            return float.Parse(str, CultureInfo.InvariantCulture);
        }

        private bool IsAverageResponseTime(string arg)
        {
            return arg.StartsWith("Average responsetime of good probes: ");
        }

        // One bool per character in the first 64 columns: true where ch appears.
        private IEnumerable<bool> ParseGoodErrorHappy(string arg, char ch)
        {
            return arg.Where((w, i) => i < 64).Select(s => s == ch);
        }

        private bool IsGoodIPv4Count(string arg)
        {
            return arg.EndsWith(" Good IPv4");
        }

        private bool IsGoodXmit(string arg)
        {
            return arg.EndsWith(" Good Xmit");
        }

        private bool IsGoodRecv(string arg)
        {
            return arg.EndsWith(" Good Recv");
        }

        private bool IsErrorRecv(string arg)
        {
            return arg.EndsWith(" Error Recv");
        }

        private bool IsHappy(string arg)
        {
            return arg.EndsWith(" Happy");
        }
    }

  • The constructor loops through each line and sends it to ParseLine for parsing.
  • ParseLine does most of the work. It uses the rest of the methods to determine what kind of line it's looking at and then parses it.
  • The Is... and the remaining Parse... methods are straightforward.
  • Take a look at how I use regexes in ParseName and ParseAverageResponseTime. In your Regex calls, you use matches and groups to pull out the information you need. Instead, I use Regex.Replace to replace the whole line with the captured group. This turns three or four lines of code into a single line.
  • Your ParseHealthFlagsResponse method and my ParseGoodErrorHappy method do nearly the same thing. This illustrates the power of LINQ extension methods. I suggest you become familiar with them.
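
As a rough side-by-side of the two regex styles and the flag parsing discussed above, here is a small sketch. The class LineParsingSketch and its method names are hypothetical, not taken from the original code. One trade-off worth knowing: Regex.Replace returns the input unchanged when the pattern does not match, so the Replace-based one-liner relies on the Is... check having already confirmed the line type, while the Match-based version can report failure itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class LineParsingSketch
{
    // Replace-based extraction, as in ParseName: the matched line is
    // replaced by capture group 1. A non-matching line comes back unchanged.
    public static string NameViaReplace(string line)
    {
        return Regex.Replace(line, @"Backend (\S*) is .*", "$1");
    }

    // Match-based extraction: a little longer, but the caller can see
    // whether the line matched at all.
    public static bool TryGetName(string line, out string name)
    {
        Match match = Regex.Match(line, @"^Backend (\S*) is ");
        name = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }

    // One bool per character in the first 64 columns: true where the flag
    // character appears. Take(64) is equivalent to Where((c, i) => i < 64).
    public static List<bool> ParseFlags(string line, char flag)
    {
        return line.Take(64).Select(c => c == flag).ToList();
    }
}
```

With the made-up line "Backend web1 is Healthy", both name helpers produce "web1"; with a line that does not match, NameViaReplace hands back the line itself, whereas TryGetName returns false.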
