Graph - Family Tree Algorithm

De openkb
Aller à : Navigation, rechercher

Sommaire

Questions

I m working on putting together a problem set for an intro-level CS course and came up with a question that, on the surface, seems very simple:

You are given a list of people with the names of their parents, their birth dates, and their death dates. You are interested in finding out who, at some point in their lifetime, was a parent, a grandparent, a great-grandparent, etc. Devise an algorithm to label each person with this information as an integer (0 means the person never had a child, 1 means that the person was a parent, 2 means that the person was a grandparent, etc.)

For simplicity, you can assume that the family graph is a DAG whose undirected version is a tree.

The interesting challenge here is that you can t just look at the shape of the tree to determine this information. For example, I have 8 great-great-grandparents, but since none of them were alive when I was born, in their lifetimes none of them were great-great-grandparents.

The best algorithm I can come up with for this problem runs in time O(n2), where n is the number of people. The idea is simple - start a DFS from each person, finding the furthest descendant down in the family tree that was born before that person s death date. However, I m pretty sure that this is not the optimal solution to the problem. For example, if the graph is just two parents and their n children, then the problem can be solved trivially in O(n). What I m hoping for is some algorithm that is either beats O(n2) or whose runtime is parameterized over the shape of the graph that makes it fast for wide graphs with a graceful degradation to O(n2) in the worst-case.

Answers

I thought of this this morning, then found that @Alexey Kukanov had similar thoughts. But mine is more fleshed out and has some more optimization, so I ll post it anyways.

This algorithm is O(n * (1 + generations)), and will work for any dataset. For realistic data this is O(n).

    • Run through all records and generate objects representing people which include date of birth, links to parents, and links to children, and several more uninitialized fields. (Time of last death between self and ancestors, and an array of dates that they had 0, 1, 2, ... surviving generations.)
    • Go through all people and recursively find and store the time of last death. If you call the person again, return the memoized record. For each person you can encounter the person (needing to calculate it), and can generate 2 more calls to each parent the first time you calculate it. This gives a total of O(n) work to initialize this data.
    • Go through all people and recursively generate a record of when they first added a generation. These records only need go to the maximum of when the person or their last ancestor died. It is O(1) to calculate when you had 0 generations. Then for each recursive call to a child you need to do O(generations) work to merge that child s data in to yours. Each person gets called when you encounter them in the data structure, and can be called once from each parent for O(n) calls and total expense O(n * (generations + 1)).
    • Go through all people and figure out how many generations were alive at their death. This is again O(n * (generations + 1)) if implemented with a linear scan.

The sum total of all of these operations is O(n * (generations + 1)).

For realistic data sets, this will be O(n) with a fairly small constant.

Source

License : cc by-sa 3.0

http://stackoverflow.com/questions/6102125/family-tree-algorithm

Related

Outils personnels
Espaces de noms

Variantes
Actions
Navigation
Outils