Internal

CoTETE.estimate_TE_from_preprocessed_data — Function

estimate_TE_from_preprocessed_data(
   parameters::CoTETEParameters,
   preprocessed_data::PreprocessedData;
   AIS_only::Bool = false,
)

Calculates the transfer entropy (TE) using the preprocessed data and the given parameters.

Note that this is an implementation of the algorithm described in Box 1 of our paper. Consulting that algorithm and the surrounding text is recommended.

julia> source = sort(1e4*rand(Int(1e4)));

julia> target = sort(1e4*rand(Int(1e4)));

julia> parameters = CoTETE.CoTETEParameters(l_x = 1, l_y = 1);

julia> preprocessed_data = CoTETE.preprocess_event_times(parameters, target, source_events = source);

julia> TE = CoTETE.estimate_TE_from_preprocessed_data(parameters, preprocessed_data);

julia> abs(TE - 0) < 0.05 # For Doctesting purposes
true

source

CoTETE.transform_marginals_to_uniform! — Function

function transform_marginals_to_uniform!(preprocessed_data::PreprocessedData)

Independently transforms each dimension of the history embeddings to be uniformly distributed.
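As an illustration of what a per-dimension transformation to uniform marginals can look like, the sketch below replaces every entry of a row with its rank divided by the number of columns, which is the empirical CDF evaluated at that entry. The helper name `rank_transform_rows!` is hypothetical and operates on a plain matrix; it is not the package's implementation, which may differ in detail.

```julia
using Random

# Hypothetical helper (not part of CoTETE): transforms each row of
# `embeddings` in place so that its values are spread evenly over (0, 1].
# Each entry is replaced by its rank within the row divided by the number
# of columns, i.e. the empirical CDF evaluated at that entry.
function rank_transform_rows!(embeddings::Matrix{Float64})
    n = size(embeddings, 2)
    for row in axes(embeddings, 1)
        order = sortperm(embeddings[row, :])  # column indices in ascending order of value
        ranks = invperm(order)                # rank of each column's value within the row
        embeddings[row, :] .= ranks ./ n
    end
    return embeddings
end

Random.seed!(1)
x = randn(2, 1000) .^ 2   # two skewed, non-uniform dimensions
rank_transform_rows!(x)
```

Because each row is handled separately, the transformation changes the marginal distribution of every dimension without changing the ordering of values within a dimension.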

source

CoTETE.transform_marginals_to_uniform_on_cdf! — Function

function transform_marginals_to_uniform_on_cdf!(
    preprocessed_data::PreprocessedData,
    history_embeddings::Array{<:AbstractFloat,2},
)

Helper method reused for transforming both the embeddings sampled at events and those sampled randomly.

source

CoTETE.make_surrogate! — Function

function make_surrogate!(
    parameters::CoTETEParameters,
    preprocessed_data::PreprocessedData,
    target_events::Array{<:AbstractFloat},
    source_events::Array{<:AbstractFloat};
    conditioning_events::Array{<:AbstractFloat} = Float32[],
)

Edit the source component of preprocessed_data.representation_joint such that it conforms to the null hypothesis of conditional independence.
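The following is a deliberately naive sketch of the swapping idea, not the package's procedure. It shuffles the source rows across columns uniformly at random, which destroys all dependence on the source rather than only the conditional dependence targeted by the null hypothesis. It also records, as a second exclusion window, the original window of the representation each source component was taken from. The function `naive_source_shuffle` and its arguments are hypothetical, and the source rows are assumed to be the last `l_Y` rows of the joint representation.

```julia
using Random

# Hypothetical helper (not part of CoTETE): returns a copy of
# `representation_joint` whose last `l_Y` rows (assumed to be the source
# component) have been swapped between columns by a random permutation.
function naive_source_shuffle(
    rng::AbstractRNG,
    representation_joint::Matrix{Float64},
    exclusion_windows::Array{Float64,3},
    l_Y::Int,
)
    n_rows, n_cols = size(representation_joint)
    source_rows = (n_rows - l_Y + 1):n_rows
    partner = randperm(rng, n_cols)  # column that donates its source component
    surrogate = copy(representation_joint)
    surrogate[source_rows, :] = representation_joint[source_rows, partner]
    # Two windows per representation: its own, plus the donor's original window.
    windows = zeros(2, 2, n_cols)
    windows[1, :, :] = exclusion_windows[1, :, :]
    windows[2, :, :] = exclusion_windows[1, :, partner]
    return surrogate, windows, partner
end

rng = MersenneTwister(3)
joint = [1.0 1.0 1.0 1.0; 0.5 0.6 0.7 0.8]  # row 1: target, row 2: source
original_windows = cat([0.0 1.0], [1.0 2.0], [2.0 3.0], [3.0 4.0]; dims = 3)
surrogate, windows, partner = naive_source_shuffle(rng, joint, original_windows, 1)
```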

source

CoTETE.make_AIS_surrogate! — Function

function make_AIS_surrogate!(
    parameters::CoTETEParameters,
    preprocessed_data::PreprocessedData,
    target_events::Array{<:AbstractFloat},
)

Edit preprocessed_data.representation_joint such that it conforms to the null hypothesis of the target histories being independent of events in the target process.

source

CoTETE.construct_sample_points_array — Function

function construct_sample_points_array(
    parameters::CoTETEParameters,
    num_samples::Integer,
    start_timestamp::AbstractFloat,
    end_timestamp::AbstractFloat,
    target_events::Array{<:AbstractFloat}
)

Constructs the array of random sample points according to the chosen method.
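The available sampling methods are not listed here. As one plausible example, the sketch below draws sample points uniformly at random between the start and end timestamps and sorts them, since downstream embedding construction assumes sorted observation points. The function `uniform_sample_points` is hypothetical and is not the package's implementation.

```julia
using Random

# Hypothetical helper (not part of CoTETE): draws `num_samples` points
# uniformly at random in [start_timestamp, end_timestamp] and returns them
# in ascending order.
function uniform_sample_points(
    rng::AbstractRNG,
    num_samples::Integer,
    start_timestamp::AbstractFloat,
    end_timestamp::AbstractFloat,
)
    points = start_timestamp .+ (end_timestamp - start_timestamp) .* rand(rng, num_samples)
    return sort!(points)
end

rng = MersenneTwister(42)
sample_points = uniform_sample_points(rng, 1000, 10.0, 250.0)
```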

source

CoTETE.preprocess_event_times — Function

function preprocess_event_times(
    parameters::CoTETEParameters,
    target_events::Array{<:AbstractFloat};
    source_events::Array{<:AbstractFloat} = Float32[],
    conditioning_events::Array{<:AbstractFloat} = Float32[]
)

Use the raw event times to create the history embeddings and other prerequisites for estimating the TE.

julia> parameters = CoTETE.CoTETEParameters(l_x = 1, l_y = 1);

julia> source = cumsum(ones(5)) .- 0.5; # source is {0.5, 1.5, 2.5, ...}

julia> target = cumsum(ones(5)); # target is {1, 2, 3, ...}

julia> preprocessed_data = CoTETE.preprocess_event_times(parameters, target, source_events = source);

julia> println(preprocessed_data.representation_joint) # All target events will be one unit back, all source events 0.5 units
[1.0 1.0 1.0 1.0; 0.5 0.5 0.5 0.5]

source

CoTETE.make_embeddings_along_observation_time_points — Function

function make_embeddings_along_observation_time_points(
    observation_time_points::Array{<:AbstractFloat},
    start_observation_time_point::Integer,
    num_observation_time_points_to_use::Integer,
    event_time_arrays::Array{<:Array{<:AbstractFloat,1},1},
    embedding_lengths::Array{<:Integer},
)

Constructs a set of embeddings from a set of observation points. The observation points and the raw event times are assumed to be sorted. Also returns the exclusion windows.

Example

julia> source = cumsum(ones(20)) .- 0.5; # source is {0.5, 1.5, 2.5, ...}

julia> conditional = cumsum(ones(20)) .- 0.25; # conditional is {0.75, 1.75, 2.75, ...}

julia> target = cumsum(ones(20)); # target is {1, 2, 3, ...}

julia> observation_points = cumsum(ones(20)) .- 0.75; # observation points are {0.25, 1.25, 2.25, ...}

julia> CoTETE.make_embeddings_along_observation_time_points(observation_points, 5, 3, [target, source, conditional], [2, 1, 1], true)
([0.25 0.25 0.25; 1.0 1.0 1.0; 0.75 0.75 0.75; 0.5 0.5 0.5], [3.0 4.25;;; 4.0 5.25;;; 5.0 6.25])

source

CoTETE.make_one_embedding — Function

function make_one_embedding(
    observation_time_point::AbstractFloat,
    event_time_arrays::Array{<:Array{<:AbstractFloat, 1}, 1},
    most_recent_event_indices::Array{<:Integer},
    embedding_lengths::Array{<:Integer},
)

Constructs the history embedding from a given point in time. Also returns the timestamp of the earliest event used in the construction of the embedding. This is used for recording the exclusion windows.

Example

julia> source = cumsum(ones(20)) .- 0.5; # source is {0.5, 1.5, 2.5, ...}

julia> conditional = cumsum(ones(20)) .- 0.25; # conditional is {0.75, 1.75, 2.75, ...}

julia> target = cumsum(ones(20)); # target is {1, 2, 3, ...}

julia> CoTETE.make_one_embedding(5.25, [target, source, conditional], [5, 5, 5], [2, 3, 1])
(Any[0.25, 1.0, 0.75, 1.0, 1.0, 0.5], 2.5)
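The output above is consistent with the following reading: for each process, the first component is the time from the most recent event to the observation point, and the remaining components are successive inter-event intervals going back in time. The sketch below re-derives the example under that assumption. The function `sketch_embedding` is hypothetical and is not the package's code.

```julia
# Hypothetical re-derivation (not part of CoTETE) of the example above.
function sketch_embedding(
    observation_time_point::Float64,
    event_time_arrays::Vector{Vector{Float64}},
    most_recent_event_indices::Vector{Int},
    embedding_lengths::Vector{Int},
)
    embedding = Float64[]
    earliest = observation_time_point
    for (events, last_index, len) in
        zip(event_time_arrays, most_recent_event_indices, embedding_lengths)
        len == 0 && continue
        # First component: time from the most recent event to the observation point.
        push!(embedding, observation_time_point - events[last_index])
        # Remaining components: inter-event intervals, stepping back in time.
        for j in 1:(len - 1)
            push!(embedding, events[last_index - j + 1] - events[last_index - j])
        end
        # Track the earliest event that contributed to the embedding.
        earliest = min(earliest, events[last_index - len + 1])
    end
    return embedding, earliest
end

source = cumsum(ones(20)) .- 0.5        # {0.5, 1.5, 2.5, ...}
conditional = cumsum(ones(20)) .- 0.25  # {0.75, 1.75, 2.75, ...}
target = cumsum(ones(20))               # {1, 2, 3, ...}
embedding, earliest =
    sketch_embedding(5.25, [target, source, conditional], [5, 5, 5], [2, 3, 1])
```

Here the earliest contributing event is the source event at 2.5, because the source embedding has length 3 and therefore reaches back furthest.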

source

CoTETE.PreprocessedData — Type

representation_joint::Array{<:AbstractFloat, 2}
exclusion_windows::Array{<:AbstractFloat, 3}
sampled_representation_joint::Array{<:AbstractFloat, 2}
sampled_exclusion_windows::Array{<:AbstractFloat, 3}
start_timestamp::AbstractFloat
end_timestamp::AbstractFloat

The transformed data that is fed into the search trees.

representation_joint::Array{<:AbstractFloat, 2}: Contains the history representation of the source, target and extra conditioning processes at each target event. Has dimension $(l_X + l_{Z_1} + l_Y) \times N_X$. Rows 1 to $l_X$ (inclusive) contain the components relating to the target process. Rows $l_X + 1$ to $l_X + l_{Z_1}$ contain the components relating to the conditioning process. Rows $l_X + l_{Z_1} + 1$ to $l_X + l_{Z_1} + l_Y$ contain the components relating to the source process. A similar convention is used by sampled_representation_joint. Note that this struct does not include an array to keep track of the history embeddings for the conditioning variables. This is because that array is simply the first $l_X + l_{Z_1}$ rows of this array and so can easily be constructed on the fly later.
exclusion_windows::Array{<:AbstractFloat, 3}: Contains records of the time windows around each representation made at target events which must be excluded when doing $k$NN searches from that representation. By default, each representation has a single window that is bounded on the left by the first point in time that was used to make an embedding at that sample and on the right by the timestamp of the target event itself. An extra window might be added if the representation is a surrogate. In this case, the second window will be the original window of the representation with which the source component is swapped. Has dimension $N_E \times 2 \times N_X$, where $N_E$ is the number of exclusion windows the representation has (one by default, two if it is a surrogate). Note that a single set of exclusion windows is used for the representations of both the joints and the conditionals. Using separate sets of windows would allow them to be slightly smaller, but the effect will be negligible for longer processes.
sampled_representation_joint::Array{<:AbstractFloat, 2}: Contains the history representation of the source, target and extra conditioning processes at each sample point. Has dimension $(l_X + l_{Z_1} + l_Y) \times N_U$. See the description of representation_joint for how the variables are split across the rows.
sampled_exclusion_windows::Array{<:AbstractFloat, 3}: Same as exclusion_windows, but contains the windows around the representations constructed at sample points.
start_timestamp::AbstractFloat: The raw timestamp of the first target event that is included in the analysis.
end_timestamp::AbstractFloat: The raw timestamp of the last target event that is included in the analysis.
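
The row layout and the window array shape described above can be illustrated with plain arrays. In the sketch below the embedding lengths, the array contents and the helper `in_exclusion_window` are all made up for illustration, and treating the window bounds as inclusive is an assumption.

```julia
# Illustrative embedding lengths (assumed values, not defaults of the package).
l_X, l_Z1, l_Y = 2, 1, 3
N_X = 4
n_rows = l_X + l_Z1 + l_Y
representation_joint = reshape(collect(1.0:(n_rows * N_X)), n_rows, N_X)

# Row blocks, in the order target, conditioning, source.
target_rows = representation_joint[1:l_X, :]
conditioning_rows = representation_joint[(l_X + 1):(l_X + l_Z1), :]
source_rows = representation_joint[(l_X + l_Z1 + 1):n_rows, :]

# The conditional representation is just the first l_X + l_Z1 rows.
representation_conditionals = representation_joint[1:(l_X + l_Z1), :]

# Exclusion windows with shape N_E x 2 x N_X (here N_E = 1):
# index [e, 1, i] is the left bound and [e, 2, i] the right bound.
exclusion_windows = cat([3.0 4.25], [4.0 5.25], [5.0 6.25], [6.0 7.25]; dims = 3)

# Hypothetical helper: does time `t` fall inside any window of representation `i`?
in_exclusion_window(t, windows, i) =
    any(windows[e, 1, i] <= t <= windows[e, 2, i] for e in axes(windows, 1))

inside = in_exclusion_window(3.5, exclusion_windows, 1)
outside = in_exclusion_window(4.5, exclusion_windows, 1)
```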

source