Persistence Diagram Vectorization

PersistenceDiagrams.PersistenceImage (Type)
PersistenceImage

PersistenceImage provides a vectorization method for persistence diagrams. Each point in the diagram is first transformed into (birth, persistence) coordinates. Then, it is weighted by a weighting function and widened by a distribution (by default a Gaussian; see the sigma keyword argument for its width). Once all the points are transformed, their distributions are summed together and discretized into an image.

The weighting ensures that points near the diagonal contribute little, which keeps this representation of the diagram stable.

Once a PersistenceImage is constructed (see below), it can be called like a function to transform a diagram into an image.

Infinite intervals in the diagram are ignored.
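The steps described above can be sketched in plain Julia. This is only an illustration of the idea, not the package's implementation: the names gaussian, linear_weight and sketch_image are hypothetical, the default sigma value is arbitrary, and sampling the density at pixel centers is a simplifying assumption (the package may discretize differently and may order rows differently).

```julia
# Plain-Julia sketch of the persistence image idea; all names are hypothetical.

# Isotropic 2D Gaussian density centered at the origin.
gaussian(x, y, sigma) = exp(-(x^2 + y^2) / (2 * sigma^2)) / (2 * pi * sigma^2)

# Weight that is 0 at zero persistence and grows linearly to 1 at `slope_end`.
linear_weight(y, slope_end) = clamp(y / slope_end, 0.0, 1.0)

function sketch_image(intervals, xlims, ylims; n=5, sigma=0.5, slope_end=ylims[2])
    # Pixel centers along each axis (assumption: sample at centers).
    xstep = (xlims[2] - xlims[1]) / n
    ystep = (ylims[2] - ylims[1]) / n
    xs = [xlims[1] + (j - 0.5) * xstep for j in 1:n]
    ys = [ylims[1] + (i - 0.5) * ystep for i in 1:n]

    image = zeros(n, n)
    for (b, d) in intervals
        isfinite(d) || continue           # infinite intervals are ignored
        p = d - b                         # (birth, death) -> (birth, persistence)
        w = linear_weight(p, slope_end)   # small weight near the diagonal
        for (j, x) in enumerate(xs), (i, y) in enumerate(ys)
            image[i, j] += w * gaussian(x - b, y - p, sigma)
        end
    end
    return image
end

intervals = [(0.0, 1.0), (0.0, 1.5), (1.0, 2.0), (0.5, Inf)]
img = sketch_image(intervals, (0.0, 1.1), (0.0, 1.65))
```

Each finite interval adds one weighted bump to the grid, so the result is a fixed-size matrix regardless of how many points the diagram has.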

Constructors

PersistenceImage(ylims, xlims; kwargs...)

Create an image ranging from ylims[1] to ylims[2] in the $y$ direction and from xlims[1] to xlims[2] in the $x$ direction.

PersistenceImage(diagrams; zero_start=true, margin=0.1, kwargs...)

Learn the $x$ and $y$ ranges from diagrams, ensuring all diagrams will fully fit in the image. Limits are increased by the margin. If zero_start is true, set the minimum y value to 0. If all intervals in diagrams have the same birth (e.g. in the zeroth dimension), a single column image is produced.

Keyword Arguments

  • size: integer or tuple of two integers. Determines the size of the array containing the image. Defaults to 5.

  • distribution: A function or callable object used to smear each interval in the diagram. Has to be callable with two Float64s as input and should return a Float64. Defaults to a normal distribution.

  • sigma: The width of the normal distribution mentioned above. Only applicable when distribution is unset. Defaults to twice the size of each pixel.

  • weight: A function or callable object used as the weighting function. Has to be callable with two Float64s as input and should return a Float64. Should equal 0.0 for y=0, but this is not enforced. Defaults to a function that is zero at $y=0$ and increases linearly until it reaches 1 at slope_end.

  • slope_end: the relative $y$ value at which the default weight function stops increasing. Defaults to 1.0.
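As a rough illustration of the callable contract described for weight and distribution, the sketch below defines one callable struct and one plain function, both taking two Float64s and returning a Float64. LinearRamp and tent are hypothetical names, and the ramp shape is only an assumption about what a linear default weight might look like. It is also assumed that the distribution is evaluated at offsets from each point.

```julia
# Hypothetical callables matching the documented (Float64, Float64) -> Float64 contract.

# A callable struct: the weight rises linearly in y and saturates at 1
# once y reaches slope_end. The x argument is accepted but unused.
struct LinearRamp
    slope_end::Float64
end
(w::LinearRamp)(x::Float64, y::Float64) = clamp(y / w.slope_end, 0.0, 1.0)

# A plain function: an unnormalized "tent" kernel as an alternative to a Gaussian.
tent(x::Float64, y::Float64) = max(0.0, 1.0 - abs(x) - abs(y))

ramp = LinearRamp(1.65)
ramp(0.3, 0.0)   # zero persistence gets zero weight
ramp(0.3, 3.0)   # saturated at 1.0

# With the package loaded, these would be passed as keyword arguments:
# PersistenceImage(diagrams; weight=ramp, distribution=tent)
```

A struct is convenient when the callable needs parameters, while a plain function is enough for a fixed kernel.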

Example

julia> diag_1 = PersistenceDiagram([(0, 1), (0, 1.5), (1, 2)]);

julia> diag_2 = PersistenceDiagram([(1, 2), (1, 1.5)]);

julia> image = PersistenceImage([diag_1, diag_2])
5×5 PersistenceImage(
  distribution = PersistenceDiagrams.Binormal(0.5499999999999999),
  weight = PersistenceDiagrams.DefaultWeightingFunction(1.65),
)

julia> image(diag_1)
5×5 Matrix{Float64}:
 0.156707  0.164263  0.160452  0.149968  0.133353
 0.344223  0.355089  0.338991  0.308795  0.268592
 0.571181  0.577527  0.535069  0.47036   0.396099
 0.723147  0.714873  0.639138  0.536823  0.432264
 0.700791  0.677237  0.582904  0.46433   0.352962

Reference

Adams, H., Emerson, T., Kirby, M., Neville, R., Peterson, C., Shipman, P., ... & Ziegelmeier, L. (2017). Persistence images: A stable vector representation of persistent homology. The Journal of Machine Learning Research, 18(1), 218-252.

PersistenceDiagrams.PersistenceCurve (Type)
PersistenceCurve

Persistence curves offer a general way to transform a persistence diagram into a vector of numbers.

This is done by first splitting the time domain into buckets. Then the intervals contained in each bucket are collected and transformed by applying fun to each of them. The result is then summarized with the stat function. If an interval is only partially contained in a bucket, it is counted partially.

Once a PersistenceCurve is constructed (see below), it can be called to convert a persistence diagram to a vector of floats.

Constructors

  • PersistenceCurve(fun, stat, start, stop; length=10, integrate=true, normalize=false): length buckets with the first starting on start and the last ending on stop.
  • PersistenceCurve(fun, stat, diagrams; length=10, integrate=true, normalize=false): learn the start and stop parameters from a collection of persistence diagrams.

Arguments

  • length: the length of the output. Defaults to 10.
  • fun: the function applied to each interval. Must have the following signature. fun(::AbstractPersistenceInterval, ::PersistenceDiagram, time)::T
  • stat: the summary function applied to the results of fun. Must have the following signature. stat(::Vector{T})::Float64
  • normalize: if set to true, normalize the result. Does not work for time-dependent funs. Defaults to false. Normalization is performed by dividing all values by stat(fun.(diag)).
  • integrate: if set to true, the amount of overlap between an interval and a bucket is considered. This prevents missing very small bars, but does not work correctly for curves with time-dependent funs where stat is a selection function (such as landscapes). If set to false, the curve is simply sampled at midpoints of buckets. Defaults to true.

Call

(::PersistenceCurve)(diagram; normalize, integrate)

Transforms a diagram. normalize and integrate override defaults set in constructor.

Example

julia> diagram = PersistenceDiagram([(0, 1), (0.5, 1), (0.5, 0.6), (1, 1.5), (0.5, Inf)]);

julia> curve = BettiCurve(0, 2, length = 4)
PersistenceCurve(always_one, sum, 0.0, 2.0; length=4, normalize=false, integrate=true)

julia> curve(diagram)
4-element Vector{Float64}:
 1.0
 3.2
 2.0
 1.0
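The values above can be reproduced by hand, which makes the partial counting concrete. The sketch below is a plain-Julia illustration for this one case (fun always returning one, stat being sum), assuming that an interval contributes the fraction of the bucket it covers. overlap_fraction and betti_sketch are hypothetical helpers, not part of the package.

```julia
# Hand-rolled check of the example above; helper names are hypothetical.

# Fraction of the bucket [lo, hi] covered by the interval [b, d].
overlap_fraction(b, d, lo, hi) = max(0.0, min(d, hi) - max(b, lo)) / (hi - lo)

function betti_sketch(intervals, start, stop; n=4)
    # n buckets of equal width between start and stop.
    edges = range(start, stop; length=n + 1)
    return [sum(overlap_fraction(b, d, edges[k], edges[k + 1]) for (b, d) in intervals)
            for k in 1:n]
end

intervals = [(0.0, 1.0), (0.5, 1.0), (0.5, 0.6), (1.0, 1.5), (0.5, Inf)]
betti_sketch(intervals, 0, 2)   # approximately [1.0, 3.2, 2.0, 1.0]
```

The second bucket, [0.5, 1.0], fully contains three bars and one fifth of the short bar (0.5, 0.6), which gives 3.2. The infinite bar is counted in every bucket from 0.5 onward.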

See Also

The following are equivalent to PersistenceCurve with appropriately selected fun and stat arguments.

More options are listed in Table 1 on page 9 of the reference.

Reference

Chung, Y. M., & Lawson, A. (2019). Persistence curves: A canonical framework for summarizing persistence diagrams. arXiv preprint arXiv:1904.07768.

PersistenceDiagrams.PDThresholding (Function)
PDThresholding

The persistence diagram thresholding function.

fun((b, d), _, t) = (d - t) * (t - b)
stat = mean
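To see what this fun and stat pair computes at a single time, here is a minimal plain-Julia sketch. thresholding_at is a hypothetical helper, and several choices in it are assumptions rather than documented behavior: only intervals alive at t (b <= t < d) take part, infinite intervals are skipped, and an empty selection yields 0.0.

```julia
using Statistics: mean

# Hypothetical helper: evaluate the thresholding curve at one time `t`.
function thresholding_at(intervals, t)
    # Assumption: keep finite intervals that are alive at time t.
    alive = [(b, d) for (b, d) in intervals if b <= t < d && isfinite(d)]
    isempty(alive) && return 0.0
    # fun((b, d), _, t) = (d - t) * (t - b), summarized with stat = mean.
    return mean((d - t) * (t - b) for (b, d) in alive)
end

thresholding_at([(0.0, 1.0), (0.5, 1.0)], 0.75)
```

For t = 0.75 the two bars give 0.25 * 0.75 and 0.25 * 0.25, so the mean is 0.125. Each term peaks at the midpoint of its bar and vanishes at the bar's endpoints.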

See also

Reference

Chung, Y. M., & Day, S. (2018). Topological fidelity and image thresholding: A persistent homology approach. Journal of Mathematical Imaging and Vision, 60(7), 1167-1179.
