This dataset contains the source code for the Survival Factorization Framework published as:
Nicola Barbieri, Giuseppe Manco, Ettore Ritacco: Survival Factorization on Diffusion Networks. THE EUROPEAN CONFERENCE ON MACHINE LEARNING & PRINCIPLES AND PRACTICE OF KNOWLEDGE DISCOVERY IN DATABASES, 2017.
The dataset is distributed as a .zip archive that can be uncompressed with standard, openly accessible zip utilities. Code is stored in .java and .jar files that can be accessed and edited with standard, openly accessible text-editing software. Testing and training datasets containing Users and Cascade Timestamps are available in various text file formats. Figures and tables from the related publication are included in .pdf format.
See the description below for more detail on file formats and instructions on building the model and Network Reconstruction.
In the related paper we propose a survival factorization framework that models information cascades by tying together social influence patterns, topical structure and temporal dynamics. This is achieved through the introduction of a latent space which encodes: (a) the relevance of an information cascade on a topic; (b) the topical authoritativeness and the susceptibility of each individual involved in the information cascade; and (c) temporal topical patterns. By exploiting the cumulative properties of the survival function and of the likelihood of the model on a given adoption log, which records the observed activation times of users and side-information for each cascade, we show that the inference phase is linear in the number of users and in the number of adoptions. The evaluation on both synthetic and real-world data shows the effectiveness of the model in detecting the interplay between topics and social influence patterns, which ultimately provides high accuracy in predicting users' activation times.
###############################
How to Build the Model:
Run survivalFactorizationEM.SurvivalFactorizationEM_Runner, providing the path to a configuration file. Given a dataset, this script generates several instances of the class survivalFactorizationEM.SurvivalFactorizationEM_Model. The configuration file must be written according to the “.properties” syntax. The fields are:
n_factors =
output =
max_iterations =
assignment_file =
event_file =
[ content_file = ]
where
n_factors: an integer list separated by “;”, e.g. n_factors = 2;4;8;16;32;64;128. This list sets the num...
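As a rough illustration of the configuration format described above, the following Java sketch parses .properties text and splits the “;”-separated n_factors list into integers. It is not part of the released code: the class ConfigSketch, its helper methods, and every value other than the n_factors example (output name, iteration count, file names) are hypothetical placeholders chosen for illustration only.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Properties;

// Hypothetical helper, not part of the Survival Factorization code base.
public class ConfigSketch {

    // Splits a ";"-separated list such as "2;4;8" into integers.
    public static int[] parseFactorList(String raw) {
        return Arrays.stream(raw.split(";"))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    // Parses .properties text; a real run would read the file from disk instead.
    public static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // All values except the n_factors list are assumed placeholders.
        String example = String.join("\n",
                "n_factors = 2;4;8;16;32;64;128",
                "output = model_output",
                "max_iterations = 100",
                "assignment_file = assignments.txt",
                "event_file = events.txt");

        Properties props = load(example);
        int[] factors = parseFactorList(props.getProperty("n_factors"));

        System.out.println("n_factors: " + Arrays.toString(factors));
        // content_file is optional, so it may be absent from the file.
        System.out.println("content_file set: "
                + (props.getProperty("content_file") != null));
    }
}
```

The optional content_file field is simply left out here; java.util.Properties returns null for missing keys, which is one way to detect that it was not supplied.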