The total path length sums up all vertex depths. As such, it is related to the Sackin index, i.e., the total external path length, and the total internal path length.
is defined for arbitrary trees and it is an imbalance index, i.e. for a fixed
it increases with decreasing balance of the tree. It can be calculated using the function totPathLen from our R package treebalance.
Definition (e.g. Dobrow, Fill, 1999, Takacs, 1992, Takacs, 1994): The total path length of a tree
is defined as
,
where denotes the Sackin index, i.e., the total external path length, and
denotes the total internal path length.
Computation time: For every tree , the total path length
can be computed in time
(see Fischer et al., 2023, Proposition 23.5).
Recursiveness: The total path length is a recursive tree shape statistic. More information on the recursion can be found in Fischer et al., 2023, Proposition 23.6.
Locality: For arbitrary trees the total path length is not local. For binary trees
it is local. (see Fischer et al., 2023, Proposition 23.7).
Maximal value and (number of) trees with maximal value on and
:
An upper bound for the total path length for arbitrary and binary trees trees
or
with
leaves which is tight for all
as well as the (number of) corresponding maximal (binary) trees has already been established (see Fischer et al., 2023, Theorem 23.3).
Minimal value and (number of) minimal trees on : A lower bound for the total path length for trees
which is tight for all
as well as the (number of) corresponding minimal trees have already been established (see Fischer et al., 2023, Proposition 23.8).
Minimal value and (number of) minimal trees on : A lower bound for the total path length for binary trees
which is tight for all
as well as the (number of) corresponding minimal trees have already been established (see Fischer et al., 2023, Theorem 23.4).
Expected value and variance under the Yule model: Open problem.
Expected value and variance under the uniform model: Open problem.