When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair
This program is tentative and subject to change.
While Large Language Models (LLMs) have shown remarkable potential for automated program repair (APR), their effectiveness is often limited by the availability of high-quality training data. Proprietary industrial codebases are valuable datasets that can be used to enhance model performance but are inaccessible due to data privacy concerns, creating a major challenge for collaborative software development.
Although federated learning has emerged as a decentralized, privacy-preserving alternative, the prevailing paradigm in software engineering research remains centralized: sensitive data must be pooled for model training, which is unacceptable in cross-organizational collaboration. Existing studies also overlook the challenges of fine-tuning LLMs for generative code-related tasks such as program repair, particularly under the real-world code heterogeneity of multi-organizational settings: they focus on labeled-data (discriminative) tasks, ignore the feature-skewed generative tasks common in software development, and evaluate only a narrow range of LLM architectures and sizes. Given the value of proprietary codebases, robust privacy protection is essential in collaborative, distributed development, yet federated approaches that harness collective knowledge while preserving privacy remain largely unexplored.
To address this gap, we investigate federated learning as a privacy-preserving approach for fine-tuning LLMs on proprietary, decentralized data to support collaborative software development and maintenance. We conduct a comprehensive empirical study of the effectiveness of federated learning for program repair, providing practical insights for real-world collaborative software development. Our study makes the following main contributions:
- An empirical study of federated fine-tuning of LLMs for program repair, demonstrating the feasibility and effectiveness of fine-tuning LLMs while preserving data privacy in decentralized, collaborative software development.
- Analysis of federated fine-tuning’s impact on the generative code-related task (i.e., program repair), contrasting with prior federated learning work on discriminative tasks and revealing insights for other generative applications.
- Evaluation of a wide range of code LLMs in federated program repair, providing insights into the suitability and practicality of different LLMs in federated learning.
- Investigation of heterogeneous code’s effects on LLM repair capabilities in federated settings, illuminating the robustness and adaptability of LLMs in handling Non-IID data in decentralized environments.
- Assessment of various federated learning algorithms’ impact on LLM-based bug fixing, providing insights into the trade-offs for federated algorithm selection in program repair.
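To make the aggregation step these contributions evaluate concrete, below is a minimal sketch of federated averaging (FedAvg), the canonical federated learning algorithm: each client updates the shared model on its private data, and the server averages the resulting parameters weighted by client dataset size, so raw code never leaves an organization. The toy linear model, function names, and data here are illustrative assumptions for exposition, not the paper's actual setup (which fine-tunes LLMs).

```python
# FedAvg sketch (illustrative): clients train locally, the server
# averages parameters weighted by local dataset size.

def local_update(weights, client_data, lr=0.1):
    # Stand-in for local fine-tuning: one SGD pass over a linear model.
    # In the paper's setting this would be epochs of LLM fine-tuning.
    w = list(weights)
    for x, y in client_data:
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def fedavg_round(global_weights, clients):
    # One communication round: weighted average of client models.
    total = sum(len(data) for data in clients)
    new_w = [0.0] * len(global_weights)
    for data in clients:
        w = local_update(global_weights, data)
        for i, wi in enumerate(w):
            new_w[i] += (len(data) / total) * wi
    return new_w

# Two "organizations" whose private data both fit y = 2*x0 + 1*x1.
clients = [
    [((1.0, 0.0), 2.0), ((0.0, 1.0), 1.0)],
    [((1.0, 1.0), 3.0), ((2.0, 0.0), 4.0)],
]
w = [0.0, 0.0]
for _round in range(50):
    w = fedavg_round(w, clients)
print(w)  # approaches [2.0, 1.0] as rounds increase
```

Only model parameters cross the network in each round; heterogeneous (Non-IID) client data and alternative aggregation rules (e.g., FedProx-style proximal terms) change how quickly, and whether, this averaging converges, which is what the contributions above measure for program repair.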
Mon 17 Nov (displayed time zone: Seoul)
11:00 - 12:30

11:00 | 10m Talk | Defects4C: Benchmarking Large Language Model Repair Capability with C/C++ Bugs | Research Papers
Jian Wang (Nanyang Technological University), Xiaofei Xie (Singapore Management University), Qiang Hu (Tianjin University), Shangqing Liu (Nanjing University), Jiongchi Yu (Singapore Management University), Jiaolong Kong (Singapore Management University), Yi Li (Nanyang Technological University)

11:10 | 10m Talk | MORepair: Teaching LLMs to Repair Code via Multi-Objective Fine-Tuning | Journal-First Track
Boyang Yang (Yanshan University), Haoye Tian (Aalto University), Jiadong Ren (Yanshan University), Hongyu Zhang (Chongqing University), Jacques Klein (University of Luxembourg), Tegawendé F. Bissyandé (University of Luxembourg), Claire Le Goues (Carnegie Mellon University), Shunfu Jin (Yanshan University)

11:20 | 10m Talk | When Fine-Tuning LLMs Meets Data Privacy: An Empirical Study of Federated Learning in LLM-Based Program Repair | Journal-First Track
Wenqiang LUO (City University of Hong Kong), Jacky Keung (City University of Hong Kong), Boyang Yang (Yanshan University), He Ye (University College London (UCL)), Claire Le Goues (Carnegie Mellon University), Tegawendé F. Bissyandé (University of Luxembourg), Haoye Tian (Aalto University), Xuan-Bach D. Le (University of Melbourne)

11:30 | 10m Talk | Test-based Patch Clustering for Automatically-Generated Patches Assessment | Journal-First Track
Matias Martinez (Universitat Politècnica de Catalunya (UPC)), Maria Kechagia (National and Kapodistrian University of Athens), Anjana Perera (Oracle Labs, Australia), Justyna Petke (University College London), Federica Sarro (University College London), Aldeida Aleti (Monash University)

11:40 | 10m Talk | Hierarchical Knowledge Injection for Improving LLM-based Program Repair | Research Papers
Ramtin Ehsani (Drexel University), Esteban Parra Rodriguez (Belmont University), Sonia Haiduc (Florida State University), Preetha Chatterjee (Drexel University, USA)

11:50 | 10m Talk | Characterizing Multi-Hunk Patches: Divergence, Proximity, and LLM Repair Challenges | Research Papers
Noor Nashid (University of British Columbia), Daniel Ding (University of British Columbia), Keheliya Gallaba (Centre for Software Excellence), Ahmed E. Hassan (Queen's University), Ali Mesbah (University of British Columbia)

12:00 | 10m Talk | Reinforcement Learning for Mutation Operator Selection in Automated Program Repair | Journal-First Track
Carol Hanna (University College London), Aymeric Blot (University of Rennes, IRISA / INRIA), Justyna Petke (University College London)

12:20 | 10m Talk | Seeing is Fixing: Cross-Modal Reasoning with Multimodal LLMs for Visual Software Issue Repair | Research Papers
Kai Huang (Technical University of Munich), Jian Zhang (Nanyang Technological University), Xiaofei Xie (Singapore Management University), Chunyang Chen (TU Munich)