Advent of Code 2022 - Day 4 Bonus: Parsing
In Day 4, I had a strong hunch that the
parseRange function could be improved. For me, parsing tends to be one of the slowest parts of the Advent of Code challenges, and so I'm always looking for ways to build a better intuition for parsing in Haskell.
Here's the original:
parseRangePair :: GenParser Char u (Range, Range) parseRangePair = do r1 <- parseRange _ <- char ',' r2 <- parseRange return (r1, r2) where parseRange = do a <- read <$> many1 digit _ <- char '-' b <- read <$> many1 digit return $ Range a b
Parsec defines parser combinators that can be used to compose powerful grammars. My original code does not use much composition, so maybe I'm doing something wrong!
Use Your Words, Human
One trick that helps in these situations is to try describing what we want in plain language. (This apparently works with computers, too.)
- Two Ranges, separated by a comma
- Each Range is two Ints, separated by a hyphen
Following this, we actually come across the
parseRanges :: GenParser Char u [Range] parseRanges = parseRange `sepBy1` (char ',') where parseRange = ...
While this looks promising, it doesn't compile.
sepBy1 parses a list. Our
Range constructor expects exactly 2 Ints, and our
part1, part2 functions expect exactly 2 ranges. Let's go a different way, but I'll remember
sepBy in later challenges.
Do or Do Not
do notation is resulting in verbose code. We have to bind the separator characters just to discard them, and we need a separate line just to return the final result.
Monads, Applicatives, and Functors are often described as being values wrapped in some abstract 'box'. In our last Day 3 bonus post, our box was functions; we wanted to compose values that still needed a String passed to them. Now we want to compose values that haven't yet been parsed. They are still in the Parser box, if you will.
We can "lift" our combining functions to work on boxed values. To discard the separator in between, we can use the operator (>>). This operator is like bind (>>=) except that it discards the first first result value.
Putting it all together, we have:
parseRangePair :: GenParser Char u (Range, Range) parseRangePair = let parseInt = read <$> many1 digit parseRange = liftA2 Range parseInt (char '-' >> parseInt) in liftA2 (,) parseRange (char ',' >> parseRange)
Not right now, thank you.
parseInt definitely belongs in my Input module, and the internal structure of
parseRangePair could also be useful to reuse. But that's a problem for a future day. Maybe even tomorrow!
Advent of Code 2022 Series
This post is part of a series describing my Haskell solutions to Advent of Code 2022.
Main Post: Day 4: Camp Cleanup