Haskell - Memoization issue on a Tree-like datastructure

De openkb
Aller à : Navigation, rechercher

Sommaire

Questions

  EDIT:    while I m still interested in an answer on the problems the execution faces in this case, it appears that it was indeed related to strictness since a -O fixes the execution and the program can handle the tree really quickly.

https://projecteuler.net/ https://projecteuler.net/

I already solved it using simple lists and dynamic programming.

I d like to solve it now using a tree datastructure (well, where a Node can have two parents so it s not really a tree). I thought I d use a simple tree but would take care to craft it so that Nodes are shared when appropriate:

data Tree a = Leaf a | Node a (Tree a) (Tree a) deriving (Show, Eq)

Solving the problem is then just a matter of going through the tree recursively:

calculate :: (Ord a, Num a) => Tree a => a
calculate (Node v l r) = v + (max (calculate l) (calculate r))
calculate (Leaf v) = v

Obviously this has exponential time complexity though. So I tried to memoize the results with :

calculate :: (Ord a, Num a) => Tree a => a
calculate = memo go
    where go (Node v l r) = v + (max (calculate l) (calculate r))
          go (Leaf v) = v

http://hackage.haskell.org/package/stable-memo-0.2.2/docs/Data-StableMemo.html#v:memo http://hackage.haskell.org/package/stable-memo-0.2.2/docs/Data-StableMemo.html#v:memo

http://felsin9.de/nnis/ghc-vis/ http://felsin9.de/nnis/ghc-vis/

On the sample tree produced by my function as such: lists2tree [[1], [2, 3], [4, 5, 6]], it returns the following correct sharing:

http://public.crydee.eu/sample.png http://public.crydee.eu/sample.png

Here we can see that the node 5 is shared.

Yet it seems that my tree in the actual Euler Problem isn t getting memoized correctly. https://github.com/m09/project-euler/blob/master/61-70/67.lhs https://github.com/m09/project-euler/blob/master/61-70/67.lhs

lists2tree :: [[a]] -> Tree a
lists2tree = head . l2t

l2t :: [[a]] -> [Tree a]
l2t (xs:ys:zss) = l2n xs ts t
    where (t:ts) = l2t (ys:zss)
l2t (x:[])      = l2l x
l2t []          = undefined

l2n :: [a] -> [Tree a] -> Tree a -> [Tree a]
l2n (x:xs) (y:ys) p = Node x p y:l2n xs ys y
l2n []     []     _ = []
l2n _      _      _ = undefined

l2l :: [a] -> [Tree a]
l2l = map (l -> Leaf l)

It basically goes through the list of lists two rows at a time and then creates nodes from bottom to top recursively.

What is wrong with this approach? I thought it might that the program will still produce a complete tree parse in thunks before getting to the leaves and hence before memoizing, avoiding all the benefits of memoization but I m not sure it s the case. If it is, is there a way to fix it?

Answers

This doesn t really address the original question, but I find it is usually easier and more powerful to use explicit memoization.

I chose to store the triangle as a list indexed by a position rather than a tree:

[     ((1,1),3),
 ((2,1),7), ((2,2), 4), 
 ....

Suppose that part of the result has already been memoized in a list of this format. Then computing the answer at a particular coordinate is trivial:

a # i = let Just v = lookup i a in v

compute tree result (x,y) = tree # (x,y) + max (result # (x+1,y)) (result # (x+1,y+1))

Now we must build result. This is also trivial; all we have to do is map compute over all valid indices.

euler67 :: [((Int, Int), Integer)] -> Integer 
euler67 tree = result # (1,1)
  where 
    xMax = maximum $ map (fst . fst) tree 

    result =    [ ((x,y), compute (x,y)) | x <- [1 .. xMax], y <- [1..x] ] 
             ++ [ ((xMax + 1,y),0) | y <- [1..xMax + 1]]

    compute (x,y) = tree # (x,y) + max (result # (x+1,y)) (result # (x+1,y+1))

Computing height of the triangle (xMax) is just getting the maximum x-index. Of course we are assuming that the tree is well formed.

The only remotely complicated part is determining which indices are valid for result. Obviously we need 1 row for every row in the original tree. Row x will have x items. We also add an extra row of zeroes at the bottom - we could handle the base case in a special way in compute but it is probably easier this way.

You ll notice that is is quite slow for the hundred row triangle. This is because lookup is traversing three lists per call to compute. To speed it up I used arrays:

euler67  :: Array (Int, Int) Integer -> Integer 
euler67  tree = result ! (1,1)
  where 
    ((xMin, yMin), (xMax, yMax)) = bounds tree

    result = accumArray (+) 0 ((xMin, yMin), (xMax + 1, yMax + 1)) $
         [ ((x,y), compute (x,y)) | x <- [xMin .. xMax], y <- [yMin..x] ] 
      ++ [ ((xMax + 1,y),0) | y <- [yMin..xMax + 1]]

    compute (x,y) = tree ! (x,y) + max (result ! (x+1,y)) (result ! (x+1,y+1))

Also here is the code I used for reading the files:

readTree  :: String -> IO (Array (Int, Int) Integer)
readTree  path = do
  tree <- readTree path
  let 
    xMax = maximum $ map (fst . fst) tree 
    yMax = maximum $ map (snd . fst) tree 
  return $ array ((1,1), (xMax,yMax)) tree

readTree :: String -> IO [((Int, Int), Integer)]
readTree path = do
  s <- readFile path 
  return $ map f $ concat $ zipWith (
 xs -> zip (repeat n) xs) [1..] $ map (zip [1..] . map read . words) $ lines s
    where 
      f (a, (b, c)) = ((a,b), c)

Source

License : cc by-sa 3.0

http://stackoverflow.com/questions/22600549/memoization-issue-on-a-tree-like-datastructure

Related

Outils personnels
Espaces de noms

Variantes
Actions
Navigation
Outils