Internal
CoTETE.estimate_TE_from_preprocessed_data
— Function
estimate_TE_from_preprocessed_data(
parameters::CoTETEParameters,
preprocessed_data::PreprocessedData;
AIS_only::Bool = false,
)
Calculates the transfer entropy (TE) from the preprocessed data and the given parameters.
Note that this is an implementation of the algorithm described in Box 1 of our paper. Consulting that algorithm and the surrounding text is recommended.
julia> source = sort(1e4*rand(Int(1e4)));
julia> target = sort(1e4*rand(Int(1e4)));
julia> parameters = CoTETE.CoTETEParameters(l_x = 1, l_y = 1);
julia> preprocessed_data = CoTETE.preprocess_event_times(parameters, target, source_events = source);
julia> TE = CoTETE.estimate_TE_from_preprocessed_data(parameters, preprocessed_data);
julia> abs(TE - 0) < 0.05 # For Doctesting purposes
true
CoTETE.transform_marginals_to_uniform!
— Function
function transform_marginals_to_uniform!(preprocessed_data::PreprocessedData)
Independently transforms each dimension of the history embeddings to be uniformly distributed.
CoTETE.transform_marginals_to_uniform_on_cdf!
— Function
function transform_marginals_to_uniform_on_cdf!(
preprocessed_data::PreprocessedData,
history_embeddings::Array{<:AbstractFloat,2},
)
Helper method reused for transforming both the embeddings sampled at events and those sampled randomly.
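As a rough illustration of what transforming a marginal to uniform means, the sketch below replaces every entry of a row with its empirical-CDF value (rank divided by the number of samples). The helper name `uniform_marginals_sketch!` is hypothetical, and the rank-based approach is an assumption; the library's own transform may differ in detail.

```julia
# Hypothetical helper, not part of CoTETE.jl: map each row of `embeddings`
# onto (0, 1] using its empirical CDF (rank / number of samples).
function uniform_marginals_sketch!(embeddings::Matrix{Float64})
    num_samples = size(embeddings, 2)
    for row in axes(embeddings, 1)
        # invperm(sortperm(v)) gives the rank of each entry of v.
        ranks = invperm(sortperm(embeddings[row, :]))
        embeddings[row, :] .= ranks ./ num_samples
    end
    return embeddings
end

# Each row (dimension) is transformed independently of the others.
embeddings = [0.3 2.0 0.1 5.0; 10.0 40.0 30.0 20.0]
uniform_marginals_sketch!(embeddings)
```

Because only the ordering within each row matters, the transform is insensitive to the very different scales of the two rows in this toy input.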
CoTETE.make_surrogate!
— Function
function make_surrogate!(
parameters::CoTETEParameters,
preprocessed_data::PreprocessedData,
target_events::Array{<:AbstractFloat},
source_events::Array{<:AbstractFloat};
conditioning_events::Array{<:AbstractFloat} = Float32[],
)
Edits the source component of preprocessed_data.representation_joint so that it conforms to the null hypothesis of conditional independence.
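One generic way to build such a surrogate is a local permutation: each sample receives the source component of another sample whose conditioning history is similar. The sketch below shows that idea only. The function name, the row layout argument `num_conditioning_rows`, and the neighbour count `k` are all assumptions made for illustration, and this is not claimed to be how `make_surrogate!` is implemented.

```julia
using Random

# Hypothetical sketch, not the CoTETE.jl implementation.
# Rows 1:num_conditioning_rows of `joint` hold the conditioning history;
# the remaining rows hold the source history. Each column is one sample.
function local_swap_surrogate_sketch(
    joint::Matrix{Float64},
    num_conditioning_rows::Int,
    k::Int;
    rng::AbstractRNG = Random.default_rng(),
)
    num_samples = size(joint, 2)
    conditioning = joint[1:num_conditioning_rows, :]
    source_rows = (num_conditioning_rows + 1):size(joint, 1)
    surrogate = copy(joint)
    for i in 1:num_samples
        # Squared Euclidean distance between conditioning histories.
        distances = vec(sum(abs2, conditioning .- conditioning[:, i:i], dims = 1))
        distances[i] = Inf                      # never pick the sample itself
        neighbours = partialsortperm(distances, 1:k)
        donor = rand(rng, neighbours)           # one of the k nearest samples
        surrogate[source_rows, i] .= joint[source_rows, donor]
    end
    return surrogate
end

# Toy input: row 1 is the conditioning history, row 2 the source history.
joint = [1.0 1.1 5.0 5.1; 0.2 0.4 0.6 0.8]
surrogate = local_swap_surrogate_sketch(joint, 1, 1; rng = MersenneTwister(1))
```

With `k = 1` the donor is always the single nearest neighbour, so the result is deterministic here; larger `k` trades closeness of the conditioning history for more randomness in the surrogate.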
CoTETE.make_AIS_surrogate!
— Function
function make_AIS_surrogate!(
parameters::CoTETEParameters,
preprocessed_data::PreprocessedData,
target_events::Array{<:AbstractFloat},
)
Edits preprocessed_data.representation_joint so that it conforms to the null hypothesis that the target histories are independent of events in the target process.
CoTETE.construct_sample_points_array
— Function
function construct_sample_points_array(
parameters::CoTETEParameters,
num_samples::Integer,
start_timestamp::AbstractFloat,
end_timestamp::AbstractFloat,
target_events::Array{<:AbstractFloat}
)
Constructs the array of random sample points according to the chosen method.
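The available sampling methods are not listed here, so the following is only a guess at the simplest possibility: points drawn uniformly at random between the start and end timestamps and then sorted. The helper name `uniform_sample_points_sketch` is hypothetical.

```julia
using Random

# Hypothetical helper, not part of CoTETE.jl: draw `num_samples` points
# uniformly at random on [start_timestamp, end_timestamp], returned sorted.
function uniform_sample_points_sketch(
    num_samples::Integer,
    start_timestamp::AbstractFloat,
    end_timestamp::AbstractFloat;
    rng::AbstractRNG = Random.default_rng(),
)
    span = end_timestamp - start_timestamp
    points = start_timestamp .+ span .* rand(rng, num_samples)
    return sort(points)
end

sample_points = uniform_sample_points_sketch(1000, 2.0, 50.0; rng = MersenneTwister(42))
```

Sorting the points matters for any downstream step that walks through observation points and event times in order.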
CoTETE.preprocess_event_times
— Function
function preprocess_event_times(
parameters::CoTETEParameters,
target_events::Array{<:AbstractFloat};
source_events::Array{<:AbstractFloat} = Float32[],
conditioning_events::Array{<:AbstractFloat} = Float32[]
)
Use the raw event times to create the history embeddings and other prerequisites for estimating the TE.
julia> parameters = CoTETE.CoTETEParameters(l_x = 1, l_y = 1);
julia> source = cumsum(ones(5)) .- 0.5; # source is {0.5, 1.5, 2.5, ...}
julia> target = cumsum(ones(5)); # target is {1, 2, 3, ...}
julia> preprocessed_data = CoTETE.preprocess_event_times(parameters, target, source_events = source);
julia> println(preprocessed_data.representation_joint) # All target events will be one unit back, all source events 0.5 units
[1.0 1.0 1.0 1.0; 0.5 0.5 0.5 0.5]
CoTETE.make_embeddings_along_observation_time_points
— Function
function make_embeddings_along_observation_time_points(
observation_time_points::Array{<:AbstractFloat},
start_observation_time_point::Integer,
num_observation_time_points_to_use::Integer,
event_time_arrays::Array{<:Array{<:AbstractFloat,1},1},
embedding_lengths::Array{<:Integer},
)
Constructs a set of embeddings from a set of observation points. The observation points and the raw event times are assumed to be sorted. Also returns the exclusion windows.
Example
julia> source = cumsum(ones(20)) .- 0.5; # source is {0.5, 1.5, 2.5, ...}
julia> conditional = cumsum(ones(20)) .- 0.25; # conditional is {0.75, 1.75, 2.75, ...}
julia> target = cumsum(ones(20)); # target is {1, 2, 3, ...}
julia> observation_points = cumsum(ones(20)) .- 0.75; # observation points are {0.25, 1.25, 2.25, ...}
julia> CoTETE.make_embeddings_along_observation_time_points(observation_points, 5, 3, [target, source, conditional], [2, 1, 1], true)
([0.25 0.25 0.25; 1.0 1.0 1.0; 0.75 0.75 0.75; 0.5 0.5 0.5], [3.0 4.25;;; 4.0 5.25;;; 5.0 6.25])
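To make the shape of this result easier to follow, here is a stand-alone sketch that rebuilds the same embeddings and exclusion windows for observation points 5 to 7. It is not the library source: `embeddings_sketch` is a hypothetical name, and using `searchsortedlast` to find the most recent event is an assumption that relies on the sorted inputs mentioned above.

```julia
# Hypothetical re-implementation for illustration only.
function embeddings_sketch(observation_points, event_time_arrays, embedding_lengths)
    num_points = length(observation_points)
    embeddings = zeros(sum(embedding_lengths), num_points)
    exclusion_windows = zeros(1, 2, num_points)
    for (i, t) in enumerate(observation_points)
        column = Float64[]
        earliest = t
        for (events, len) in zip(event_time_arrays, embedding_lengths)
            # Index of the most recent event at or before t (events are sorted).
            last_index = searchsortedlast(events, t)
            # First component: time since the most recent event.
            push!(column, t - events[last_index])
            # Remaining components: intervals between successively older events.
            for j in 1:(len - 1)
                push!(column, events[last_index - j + 1] - events[last_index - j])
            end
            earliest = min(earliest, events[last_index - len + 1])
        end
        embeddings[:, i] = column
        # Window spans from the earliest event used up to the observation point.
        exclusion_windows[1, :, i] = [earliest, t]
    end
    return embeddings, exclusion_windows
end

source = cumsum(ones(20)) .- 0.5
conditional = cumsum(ones(20)) .- 0.25
target = cumsum(ones(20))
observation_points = cumsum(ones(20)) .- 0.75
embeddings, windows = embeddings_sketch(observation_points[5:7], [target, source, conditional], [2, 1, 1])
```

Each column of `embeddings` is one observation point, and `windows[1, :, i]` holds the left and right bounds of the exclusion window for column `i`.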
CoTETE.make_one_embedding
— Function
function make_one_embedding(
observation_time_point::AbstractFloat,
event_time_arrays::Array{<:Array{<:AbstractFloat, 1}, 1},
most_recent_event_indices::Array{<:Integer},
embedding_lengths::Array{<:Integer},
)
Constructs the history embedding from a given point in time. Also returns the timestamp of the earliest event used in the construction of the embedding. This is used for recording the exclusion windows.
Example
julia> source = cumsum(ones(20)) .- 0.5; # source is {0.5, 1.5, 2.5, ...}
julia> conditional = cumsum(ones(20)) .- 0.25; # conditional is {0.75, 1.75, 2.75, ...}
julia> target = cumsum(ones(20)); # target is {1, 2, 3, ...}
julia> CoTETE.make_one_embedding(5.25, [target, source, conditional], [5, 5, 5], [2, 3, 1])
(Any[0.25, 1.0, 0.75, 1.0, 1.0, 0.5], 2.5)
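The output above can be read as follows: for each process, the first component is the time from the observation point back to the most recent event, and the remaining components are the intervals between successively older events. The sketch below reproduces those numbers under that reading. `one_embedding_sketch` is a hypothetical stand-alone function, not the library's code.

```julia
# Hypothetical stand-alone version for illustration; not the CoTETE.jl source.
function one_embedding_sketch(
    observation_time_point,
    event_time_arrays,
    most_recent_event_indices,
    embedding_lengths,
)
    embedding = Float64[]
    earliest = observation_time_point
    for (events, last_index, len) in
        zip(event_time_arrays, most_recent_event_indices, embedding_lengths)
        # Time since the most recent event of this process.
        push!(embedding, observation_time_point - events[last_index])
        # Inter-event intervals, moving further into the past.
        for j in 1:(len - 1)
            push!(embedding, events[last_index - j + 1] - events[last_index - j])
        end
        # Track the oldest event touched by any process.
        earliest = min(earliest, events[last_index - len + 1])
    end
    return embedding, earliest
end

source = cumsum(ones(20)) .- 0.5
conditional = cumsum(ones(20)) .- 0.25
target = cumsum(ones(20))
embedding, earliest = one_embedding_sketch(5.25, [target, source, conditional], [5, 5, 5], [2, 3, 1])
```

Here the source process has the longest embedding (length 3), so its third most recent event at 2.5 sets the returned earliest timestamp.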
CoTETE.PreprocessedData
— Type
representation_joint::Array{<:AbstractFloat, 2}
exclusion_windows::Array{<:AbstractFloat, 3}
sampled_representation_joint::Array{<:AbstractFloat, 2}
sampled_exclusion_windows::Array{<:AbstractFloat, 3}
start_timestamp::AbstractFloat
end_timestamp::AbstractFloat
The transformed data that is fed into the search trees.
representation_joint::Array{<:AbstractFloat, 2}: Contains the history representations of the target, conditioning and source processes at each target event. Has dimension $(l_X + l_{Z_1} + l_Y) \times N_X$. Rows 1 to $l_X$ (inclusive) contain the components relating to the target process. Rows $l_X + 1$ to $l_X + l_{Z_1}$ contain the components relating to the conditioning process. Rows $l_X + l_{Z_1} + 1$ to $l_X + l_{Z_1} + l_Y$ contain the components relating to the source process. A similar convention is used by sampled_representation_joint. Note that this struct does not include a separate array for the history embeddings of the conditioning variables. That array is simply the first $l_X + l_{Z_1}$ rows of this array, so it can easily be constructed on the fly later.
exclusion_windows::Array{<:AbstractFloat, 3}: Contains records of the time windows around each representation made at target events which must be excluded when doing $k$NN searches from that representation. By default, each representation has a single window, bounded on the left by the earliest point in time that was used to make an embedding at that sample and on the right by the timestamp of the target event itself. An extra window might be added if the representation is a surrogate. In this case, the second window will be the original window of the representation with which the source component is swapped. Has dimension $N_E \times 2 \times N_X$, where $N_E$ is the number of exclusion windows the representation has (one by default, two if it is a surrogate). Note that a single set of exclusion windows is used for the representations of both the joints and the conditionals. Using separate sets of windows would allow them to be slightly smaller, but the effect will be negligible for longer processes.
sampled_representation_joint::Array{<:AbstractFloat, 2}: Contains the history representations of the target, conditioning and source processes at each sample point. Has dimension $(l_X + l_{Z_1} + l_Y) \times N_U$. See representation_joint for a description of how the variables are split across the rows.
sampled_exclusion_windows::Array{<:AbstractFloat, 3}: Same as exclusion_windows, but contains the windows around the representations constructed at sample points.
start_timestamp::AbstractFloat: The raw timestamp of the first target event that is included in the analysis.
end_timestamp::AbstractFloat: The raw timestamp of the last target event that is included in the analysis.
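The row layout and the window array shape described above can be sketched with stand-in arrays. The embedding lengths, the random data and the helper `in_exclusion_window` are illustrative assumptions; nothing here constructs a real PreprocessedData value.

```julia
# Illustrative embedding lengths for target, conditioning and source.
l_X, l_Z1, l_Y = 2, 1, 3
N_X = 4                                   # number of target events

# Stand-in for representation_joint with the documented shape.
representation_joint = rand(l_X + l_Z1 + l_Y, N_X)

# Row ranges following the documented convention.
target_rows = 1:l_X
conditioning_rows = (l_X + 1):(l_X + l_Z1)
source_rows = (l_X + l_Z1 + 1):(l_X + l_Z1 + l_Y)

# The conditional representation is the first l_X + l_Z1 rows, taken on the fly.
representation_conditional = @view representation_joint[1:(l_X + l_Z1), :]

# Stand-in exclusion windows of shape N_E x 2 x N_X, with one window each.
exclusion_windows = zeros(1, 2, N_X)
exclusion_windows[1, :, 1] = [3.0, 4.25]

# Hypothetical helper: does `timestamp` fall inside any window of representation `i`?
function in_exclusion_window(exclusion_windows, i, timestamp)
    return any(
        exclusion_windows[w, 1, i] <= timestamp <= exclusion_windows[w, 2, i]
        for w in axes(exclusion_windows, 1)
    )
end

inside = in_exclusion_window(exclusion_windows, 1, 4.0)
outside = in_exclusion_window(exclusion_windows, 1, 5.0)
```

Using a view for the conditional rows avoids copying, which fits the remark that this array can be constructed on the fly rather than stored.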