Saturday, March 22, 2014

[dvxmfpfy] Breaking a list into chunks

Breaking a list into chunks is an unfold:

list_chunks :: Int -> [a] -> [[a]];
list_chunks chunk_size = unfoldr $ \l -> if null l then Nothing else Just $ splitAt chunk_size l;

The output tuple of splitAt is conveniently just the right format for input to unfoldr. Is there a more elegant way to write if null l then Nothing else Just ...? The function listToMaybe doesn't quite work, as it assumes a single-element list.

> list_chunks 3 "hello world"
["hel","lo ","wor","ld"]

Here is a ByteString version:

import qualified Data.ByteString.Lazy as BS;
bytestring_blocks :: Int64 -> BS.ByteString -> [BS.ByteString];
bytestring_blocks block_size = unfoldr $ \b -> if BS.null b then Nothing else Just $ BS.splitAt block_size b;

After getting recursion beat into our skulls in introductory computer science courses, there's a feeling in functional programming to eschew explicit recursion in favor of higher-order functions like unfoldr which capture the recursive computational structure.

5 comments :

Unknown said...

How about using guard from Control.Monad? This turns your example into:

import Control.Monad (guard)
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)
import Data.List (unfoldr)
bytestring_blocks :: Int64 -> L.ByteString -> [L.ByteString]
bytestring_blocks size = unfoldr $ \bs -> guard (not (L.null bs)) >> return (L.splitAt size bs)

Unknown said...

I'm not sure whether there is a "more elegant" way to write the condition, but here's one way to avoid writing it:

list_chunks chunk_size = takeWhile (not . null) . unfoldr (Just . splitAt chunk_size)

Anonymous said...

A search on types via hoogle revealed that "find" can be used to convert to maybe:

list_chunks chunk_size = unfoldr $ \l ->
fmap (splitAt chunk_size) $ maybeEmpty l

maybeEmpty :: [a] -> Maybe [a]
maybeEmpty l = find (not . null) [l]

but it still doesn't feel elegant or right.

alex404 said...

Don't write off recursion quite so easily. As far as I can tell, this version runs about twice as quickly:


breakEvery :: Int -> [x] -> [[x]]
breakEvery _ [] = []
breakEvery n xs =
take n xs : breakEvery n (drop n xs)


Tail recursion is still tail recursion, and will perform very nicely. Especially in cases where you taking one recursive structure and turning it into another recursive structure (see: tail recursion modulo cons) it's very easy to write nice versions of these algorithms.

alex404 said...

And in this case in particular, you'll avoid unnecessary handling of Maybe types.