Rongsheng Wang
M.S., Macao Polytechnic University (2024)
I am Rongsheng Wang. I studied at Macao Polytechnic University (MPU), where I worked under the supervision of Prof. Tao Tan. My research interests include computer vision, large language models (LLMs), trustworthy LLMs, retrieval-augmented generation (RAG), and agent systems. I am also interested in generative data research. I am currently seeking a PhD opportunity.
I publish many open-source projects on 🔗GitHub. I share some of the datasets and model weights used by these projects on 🔗HuggingFace. I also post small-parameter language models on 🔗Ollama so that people can run them locally.
Macao Polytechnic University
M.S. in Big Data and Internet of Things, Sep. 2022 - Jul. 2024
Henan Polytechnic University
B.S. in Computer Science (AI), Sep. 2018 - Jul. 2022
Junying Chen, Zhenyang Cai, Ke Ji, Xidong Wang, Wanlong Liu, Rongsheng Wang, Benyou Wang† († corresponding author)
arXiv 2024 Conference
We introduce HuatuoGPT-o1, a medical LLM designed for advanced medical reasoning. It identifies mistakes, explores alternative strategies, and refines its answers by leveraging a specialized medical verifier. The model improves its reasoning in two stages: the verifier first guides the search for complex reasoning trajectories used in fine-tuning, and reinforcement learning (PPO) with verifier-based rewards then improves reasoning further.
Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang, Benyou Wang† († corresponding author)
arXiv 2024 Conference
We introduce Med-MAT, a VQA dataset comprising 106 open-source medical datasets designed to advance generalization experiments and support the training of powerful medical multimodal large language models (MLLMs). This dataset highlights Compositional Generalization (CG) as a key mechanism, enabling MLLMs to better understand unseen images and achieve more data-efficient training.
Yaofei Duan, Patrick Cheong-Iao Pang, Ping He, Rongsheng Wang, Yue Sun, Chuntao Liu, Xiaorong Zhang, Xirong Yuan, Pengjie Song, Chan-Tong Lam, Ligang Cui, Tao Tan† († corresponding author)
IEEE Journal of Biomedical and Health Informatics 2024 Journal
This study introduces the "Multi-modal Multi-task Network" (3MT-Net), a deep learning architecture that combines clinical data with B-mode and color Doppler ultrasound. 3MT-Net employs AM-CapsNet for tumor feature extraction, cross-attention for data fusion, and ensemble learning for optimization. In extensive testing on two datasets, 3MT-Net outperformed the industrial-grade CAD product S-Detect, achieving a higher AUC.
Lin Li, Rongsheng Wang, Qimin Yang, Jiexin Chen, Patrick Cheong-Iao Pang, Yapeng Wang, Ka-Hou Chan, Tao Tan, Jie Ma† († corresponding author)
RSNA’s Cutting-Edge Research 2024 Conference Oral
We introduce XrayGLM, a conversational medical visual language model that analyzes and summarizes chest X-rays, aimed at improving domain-specific expertise for radiology tasks compared to general large models.
Qimin Yang, Rongsheng Wang, Jiexin Chen, Runqi Su, Tao Tan† († corresponding author)
Long-Context Foundation Models (LCFM) at ICML 2024 Conference Poster
This study investigates the decline in long-context understanding for medical LLMs after domain-specific fine-tuning, conducting experiments to determine the best composition of general and medical training data to balance diagnostic knowledge with comprehensive reading abilities.
Xiaojuan Xue, Deshiwei Zhang, Chengyang Sun, Yiqiao Shi, Rongsheng Wang, Tao Tan, Peng Gao, Sujie Fan, Guangtao Zhai, Menghan Hu, Yue Wu† († corresponding author)
Computers in Biology and Medicine 2024 Journal
We introduce Xiaoqing, an LLM tailored for glaucoma and developed through comparative and experiential experiments. It serves glaucoma patients and medical research better than general and clinical AI assistants, providing more informative and readable responses to glaucoma-related questions in Chinese.
Rongsheng Wang, Yaofei Duan, Yukun Li, Dashun Zheng, Xiaohong Liu, Chan-Tong Lam, Tao Tan† († corresponding author)
The Visual Computer 2023 Journal
We propose a Parallel CNNs-Transformer network with multi-scale feature context aggregation (PCTMF-Net) for electrocardiogram heart sound classification, which combines CNNs and a transformer encoder to extract hierarchical features and achieves state-of-the-art performance on publicly available datasets.