About me
I am an Associate Professor of Computer Science within the SySMA research unit of IMT Lucca.
I am currently a Research Associate at the HPC Lab at ISTI, CNR, Pisa.
I worked as a research scientist at IBM Research Ireland in Dublin; then, I held senior data scientist positions at Tiscali, Cloud4Wi, and Vodafone. I received my PhD in Information Engineering from the University of Pisa in 2010, where I worked at KDD Lab, CNR, Pisa. During my PhD, I was a visiting researcher at Senseable City Lab at M.I.T., Cambridge, MA, US.
News
Security in Federated and Distributed Machine Learning and Artificial Intelligence Environments
Contact persons: Fabio Pinelli, Alessandro Betti
Curriculum: Software, System and Infrastructure Security
Funds: University
Additional benefits: Full board accommodation
Website: https://sysma.imtlucca.it
Description
Modern technologies increasingly use federated learning, which trains machine learning models across decentralised devices without transferring data to a central server, thereby enhancing privacy. For example, the next-word predictions on Gboard for Android devices are generated using this approach.
However, implementing federated and distributed machine learning systems introduces new challenges and opportunities in the cybersecurity landscape. These systems enable collaboration among different nodes, allowing models to be trained on distributed data without centralising the data themselves. This decentralisation also introduces potential security vulnerabilities that must be addressed effectively to ensure data integrity and confidentiality.
The objective of the thesis is to address the following challenges and goals:
- Vulnerability Analysis: Conduct a detailed analysis of existing vulnerabilities in federated and distributed machine learning systems, including privacy threats, model manipulation attacks, and potential data security breaches.
- Development of Defence Techniques: Design and develop new defence techniques to mitigate the identified vulnerabilities, using approaches such as homomorphic encryption and secure and robust aggregation methods, among other advanced techniques. The effectiveness of these defence techniques will be evaluated through a series of case studies and practical experiments.
- Integration: Integrate the proposed solutions into existing federated learning frameworks and scenarios, connecting the theoretical and practical aspects of the identified problems.
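As a rough illustration of why robust aggregation matters, the sketch below compares plain federated averaging with a coordinate-wise median when one client submits a manipulated update. It is only a toy example under stated assumptions: the function names (`fed_avg`, `fed_median`) and the client updates are hypothetical, and it is not the method the thesis will develop.

```python
from statistics import mean, median

def fed_avg(updates):
    """Plain federated averaging: coordinate-wise mean of client updates."""
    return [mean(coords) for coords in zip(*updates)]

def fed_median(updates):
    """One simple robust aggregator: coordinate-wise median of client updates."""
    return [median(coords) for coords in zip(*updates)]

# Hypothetical model updates (two parameters each); raw data never leaves the clients.
honest = [[1.0, 2.0], [1.1, 1.9], [0.9, 2.1], [1.0, 2.0]]
poisoned = [[100.0, -100.0]]  # assumed model manipulation attack by one client
updates = honest + poisoned

avg_result = fed_avg(updates)        # pulled far away from the honest updates
median_result = fed_median(updates)  # stays close to the honest updates

print("mean:  ", avg_result)
print("median:", median_result)
```

In this toy setting a single manipulated update is enough to shift the averaged model, while the median is unaffected as long as honest clients form the majority.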
From Threat to Tool: Fine-Tuning LLMs to Combat Disinformation in Digital Media
Contact person: Fabio Pinelli
Curriculum: Data Governance & Protection
Funds: MUR DM 630 (scholarship co-funded by a research institution where the student will spend 6 to 18 months of the PhD)
Additional benefits: Full board accommodation
Research Institution: CNR-IIT
Research Institution Contact Person: Marinella Petrocchi
Website: https://sysma.imtlucca.it/, https://marinellapetrocchi.wixsite.com/mysite
Description
The proliferation of large language models (LLMs) such as GPT-3 and GPT-4 has revolutionized natural language processing and various applications ranging from automated customer service to advanced tools for information retrieval. However, their potential to spread disinformation poses significant challenges. This PhD project aims to explore the dual role of LLMs as both enablers and mitigators of disinformation. By examining the mechanisms through which LLMs generate, amplify, and disseminate false information, we seek to understand their impact on public discourse and trust in digital media. The research will employ a multi-disciplinary approach, integrating computational linguistics, machine learning, and social sciences, to analyze large datasets from social media and other digital platforms. Key objectives include identifying patterns in LLM-generated disinformation, evaluating the efficacy of current mitigation strategies, and developing new techniques to enhance model transparency and accountability.
The main outcome of this study is envisioned to be the development of a fine-tuned LLM specifically designed to detect and mitigate disinformation. This model will be trained on a curated dataset containing both accurate information and known disinformation. The fine-tuned LLM will be optimized to recognize and counteract false narratives.
Research Activities
My research interests are Data Mining and Machine Learning and their application in different domains. Early in my scientific career, I mainly focused on developing Data Mining frameworks for spatiotemporal data, applied to Urban Dynamics and Intelligent Transportation Systems. In recent years, I have worked on machine learning pipelines for business and marketing problems.
Currently, I am working on:
Deep learning methods on mobile phone sensor data (e.g., GPS trajectories, human activity, etc.)
Security on Federated Learning Frameworks
Trustworthiness of news
Applied machine learning (e.g., economics, blockchain, etc.)
Recent Publications
C. Pugliese, F. Lettich, F. Pinelli, C. Renso. Understanding Human Mobility Dynamics: Insights from Summarized Semantic Trajectories, 25th IEEE International Conference on Mobile Data Management (MDM), 2024 (short paper)
L. Galletta, F. Pinelli. Explainable Ponzi Schemes Detection on Ethereum. 39th ACM/SIGAPP Symposium on Applied Computing, 2024
J. Bianchi, M. Pratelli, M. Petrocchi, F. Pinelli. Evaluating Trustworthiness of Online News Publishers via Article Classification. 39th ACM/SIGAPP Symposium on Applied Computing, 2024
C. Pugliese, F. Lettich, F. Pinelli, C. Renso. Summarizing Trajectories Using Semantically Enriched Geographical Context. SIGSPATIAL 2023
F. Lettich, C. Pugliese, C. Renso, F. Pinelli. Semantic Enrichment of Mobility Data: A Comprehensive Methodology and the MAT-BUILDER System. IEEE Access, 2023
F. Lettich, C. Pugliese, C. Renso, F. Pinelli. A general methodology for building multiple aspect trajectories. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, 515-517, 2023
G. Costa, F. Pinelli, S. Soderi, G. Tolomei. Turning Federated Learning Systems Into Covert Channels. IEEE Access 10, 130642-130656, 2022
C. Pugliese, F. Lettich, C. Renso, F. Pinelli. Mat-builder: a system to build semantically enriched trajectories. 23rd IEEE International Conference on Mobile Data Management (MDM), 274-277, 2022
Selected Publications
C. Pugliese, F. Lettich, F. Pinelli, C. Renso. Summarizing Trajectories Using Semantically Enriched Geographical Context. SIGSPATIAL 2023
F. Pinelli, R. Nair, F. Calabrese, G. Di Lorenzo, M. L. Sbodio, and M. Berlingerio. Data-driven transit network design from mobile phone trajectories. IEEE Transactions on Intelligent Transportation Systems, 2016.
G. Di Lorenzo, M. L. Sbodio, F. Calabrese, M. Berlingerio, F. Pinelli, and R. Nair. Allaboard: Visual exploration of cellphone mobility data to optimise public transport. IEEE Transactions on Visualization and Computer Graphics, 2016.
Y. Dong, F. Pinelli, Y. Gkoufas, Z. Nabi, F. Calabrese, and N. V. Chawla. Inferring unusual crowd events from mobile phone call detail records. ECML/PKDD, 2015.
F. Pinelli, F. Calabrese, and E. Bouillet. A methodology for denoising and generating bus infrastructure data. IEEE Transactions on Intelligent Transportation Systems, 2014.
M. Berlingerio, F. Pinelli, F. Calabrese. Abacus: frequent pattern mining-based community discovery in multidimensional networks. Data Mining and Knowledge Discovery, 2013.
F. Giannotti, M. Nanni, D. Pedreschi, F. Pinelli, C. Renso, S. Rinzivillo, R. Trasarti. Unveiling the complexity of human mobility by querying and mining massive trajectory data. The VLDB Journal, 2011.
R. Trasarti, F. Pinelli, M. Nanni, and F. Giannotti. Mining mobility user profiles for car pooling. ACM SIGKDD, 2011.
A. Monreale, F. Pinelli, R. Trasarti, and F. Giannotti. Wherenext: a location predictor on trajectory pattern mining. ACM SIGKDD, 2009.
M. Berlingerio, F. Pinelli, M. Nanni, and F. Giannotti. Temporal mining for interactive workflow data analysis. ACM SIGKDD, 2009.
F. Giannotti, M. Nanni, F. Pinelli, and D. Pedreschi. Trajectory pattern mining. ACM SIGKDD, 2007.
See my Google Scholar profile or my DBLP page for a full list of publications.
Patents
F. Calabrese and F. Pinelli. Public transportation fare evasion inference using personal mobility data, 2014.
A. Botea, M. Berlingerio, E. Bouillet, F. Calabrese, and F. Pinelli. System for inferring inconvenient traveller experience in journeys, 2013.
R. Nair, F. Pinelli, and F. Calabrese. Real-time system to predict and correct scheduled service bunching, 2013.
E. Bouillet, F. Calabrese, F. Pinelli, M. Sinn, and J. Yoon. Estimation of arrival times at transit stops, 2012.
E. Bouillet, F. Calabrese, F. Pinelli, and O. Verscheure. De-noising scheduled transportation data, 2012.