Acknowledgement:
This research was supported in part by the National Science Foundation award #1842183.
Disclaimer:
This work is sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
Principal Investigator:
Zoran Obradovic, L.H. Carnell Professor of Data Analytics, Temple University
Co-PI:
Eduard Dragut, Associate Professor, Computer and Information Sciences, Temple University
Abstract
Social media and news articles play an important role in documenting daily societal events. News outlets host social media platforms that let users debate daily news topics. For example, the social networks at NY Times, The Guardian, and Washington Post have more than 130K users each. Together, they represent a considerable segment of the varied opinions of society at large. The difficult and high-risk problem addressed in this project is that of transforming the streams of social media chatter at hundreds of news outlets into data signals, mining those signals for the ones that foretell the imminence of an (important) event, and developing sound predictive analytics on top of them. The project benefits multiple segments of society, such as social scientists and policy makers, because its results provide tools to predict important real-life events using indicators observed on social media. There is growing interest in mining social media streams for early detection of (important) events, such as crisis detection (and response) and prediction of social unrest.
The objective of this project is to assess the feasibility of leveraging trends in past social response to news articles, observed over a few hundred social media streams, to detect the emergence of social, economic, and political events. This project seeks to create a proof of concept that works with a few hundred social communities from news outlets. Specific aims consist of (i) developing methods for automatic data collection and (ii) efficient predictive modeling at that scale. The results (e.g., software tools) are made available to benefit researchers in academia and industry. Free, open-source software for implementing the developed techniques is distributed to enhance existing research infrastructure. The educational component of the project includes the involvement of graduate and undergraduate students in training and research and the incorporation of research projects/results in appropriate courses.
Publications (peer-reviewed)
- He, L., Han, C., Mukherjee, A., Obradovic, Z., Dragut, E. (2020) “On the Dynamics of User Engagement in News Comment Media,” WIREs Data Mining and Knowledge Discovery, Vol. 10, issue 1, e1342.
- Pavlovski, M., Gligorijevic, J., Stojkovic, I., Agrawal, S., Komirishetty, S., Gligorijevic, Dj., Bhamidipati, N., Obradovic, Z. (in press) “Time-Aware User Embeddings as a Service,” Proc. 26th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD 2020), San Diego, Aug. 2020
- Cao, X.H., Han, C., Glass, L.M., Kindman, A., Obradovic, Z. (2019) “Time-to-Event Estimation by Re-Defining Time,” Journal of Biomedical Informatics, Dec. 2019, Vol. 100, 103326.
- Gligorijevic, J., Gligorijevic, Dj., Stojkovic, I., Bai, X., Goyal, A., Obradovic, Z. (2019) “Deeply Supervised Model for Click-Through Rate Prediction in Sponsored Search,” Data Mining and Knowledge Discovery, Sept. 2019, Vol. 33, Issue 5, pp. 1446-1467.
- Stanojevic, M., Alshehri, J., Obradovic, Z. (2019) “Surveying Public Opinion Using Label Prediction on Social Media Data,” the IEEE/ACM Int’l Conf. Social Networks Analysis and Mining (ASONAM 2019), Vancouver, CA, Aug. 2019.
- Stanojevic, M., Alshehri, J., Dragut, E., Obradovic, Z. (2019) “Biased News Data Influence on Classifying Social Media Posts,” 3rd Int’l Workshop on Recent Trends in News Information Retrieval (NewsIR 2019), collocated with 42nd Int’l ACM SIGIR Conf. on Research Development in Information Retrieval, Paris, July, 2019.
- Han, C., Albarakati, N., Cao, X.H., Obradovic, Z. (2019) “A Distributable Convex Approach for Graph Structure Discovery,” 15th International Workshop on Mining and Learning with Graphs (MLG) 2019, held in conjunction with 25th ACM SIGKDD Conf. Knowledge Discovery and Data Mining (KDD 2019), Anchorage, Aug. 2019.
- Han, C., Cao, X.H., Stanojevic, M., Ghalwash, M., Obradovic, Z. (2019) “Temporal Graph Regression via Structure-Aware Intrinsic Representation Learning,” Proc. 19th SIAM Int’l Conf. Data Mining, Calgary, Canada, May 2019.
- Roychoudhury, S., Zhou, F., Obradovic, Z. (2019) “Leveraging Subsequence-orders for Univariate and Multivariate Time-series Classification,” Proc. 19th SIAM Int’l Conf. Data Mining, Calgary, Canada, May 2019.
Other Reports
- Pham, Q., Stanojevic, M. and Obradovic, Z., 2020. Extracting Entities and Topics from News and Connecting Criminal Records. arXiv preprint arXiv:2005.00950.
https://arxiv.org/abs/2005.00950
Data and Software
- https://github.com/marija-stanojevic/asonam19-paper
used at: Stanojevic, M., Alshehri, J., Obradovic, Z. (2019) “Surveying Public Opinion Using Label Prediction on Social Media Data,” the IEEE/ACM Int’l Conf. Social Networks Analysis and Mining (ASONAM 2019), Vancouver, CA, Aug. 2019.
- https://github.com/marija-stanojevic/newsir19-paper
used at: Stanojevic, M., Alshehri, J., Dragut, E., Obradovic, Z. (2019) “Biased News Data Influence on Classifying Social Media Posts,” 3rd Int’l Workshop on Recent Trends in News Information Retrieval (NewsIR 2019), collocated with 42nd Int’l ACM SIGIR Conf. on Research Development in Information Retrieval, Paris, July, 2019.
- https://github.com/marija-stanojevic/crime-analysis-2020
used at: Pham, Q., Stanojevic, M. and Obradovic, Z., 2020. Extracting Entities and Topics from News and Connecting Criminal Records. arXiv preprint arXiv:2005.00950.
Undergraduate Students
- Quang Pham (advised by M. Stanojevic and Prof. Obradovic)
MS Students
- Parisa Khan (advised by J. Alshehri and Prof. Obradovic)
- Megha Patel (advised by L. He and Prof. Dragut)
PhD Students
- Nouf Albarakati (advised by Prof. Obradovic)
- Jovan Andjelkovic (advised by Prof. Obradovic)
- Jumanah Alshehri (advised by Prof. Obradovic)
- Lihong He (advised by Prof. Dragut)
- William Power (advised by Prof. Obradovic)
- Marija Stanojevic (advised by Prof. Obradovic)
Graduated PhD Students
- Dr. Xi Hang Cao (finished PhD in year 2019, advised by Prof. Obradovic)
- Dr. Jesse Glass (finished PhD in year 2020, advised by Prof. Obradovic)
- Dr. Chao Han (finished PhD in year 2019, advised by Prof. Obradovic)
- Dr. Shoumik Roychoudhury (finished PhD in year 2020, advised by Prof. Obradovic)