LLM-driven Biomedical Knowledge

UniD³: Unified Drug-Disease Dataset Construction via KG-RAG

UniD³ combines Llama3.3-70B with Knowledge Graph Retrieval-Augmented Generation (KG-RAG) to transform over 150,000 PubMed articles into structured, high-fidelity biomedical datasets. A dual-stage entity extraction pipeline keeps graph construction consistent and resistant to noise.

150k+

PubMed publications

0.80+

F1 across tasks

0.9005

Expert F1 (DDM)

UniD³ Architecture

Drug-Disease Matching (DDM)

Identify high-confidence drug and disease relationships with contextual explanations grounded in PubMed evidence.

Drug Effectiveness Assessment (DEA)

Evaluate drug outcomes, effectiveness signals, and clinical directions across large-scale biomedical corpora.

Drug-Target Analysis (DTA)

Trace molecular targets, pathways, and intervention strategies by mining structured triplets.

Dual-Stage KG Construction

UniD³ first performs paper-level extraction to capture localized research context, then promotes consistent entities into a KG-level summary. LightRAG and custom prompts distill reliable triplets that fuel downstream QA and dataset generation pipelines.
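The two stages can be pictured with a minimal sketch. This is not the UniD³ implementation: the `Triplet` fields, the `normalize` helper, and the `min_papers` threshold are hypothetical, and "consistent" is assumed here to mean "seen in at least `min_papers` distinct papers". Paper-level triplets are taken as given (in UniD³ they come from the LLM and LightRAG), and the sketch only shows how entities might be promoted into a KG-level summary.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One paper-level triplet (hypothetical schema, for illustration only)."""
    head: str
    relation: str
    tail: str
    paper_id: str  # identifier of the source paper


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so surface variants compare equal."""
    return " ".join(name.lower().split())


def promote_entities(triplets, min_papers=2):
    """Stage 2a: keep entities that appear in at least `min_papers` papers.

    The threshold is an assumed stand-in for the page's notion of
    "consistent" entities.
    """
    papers_by_entity = defaultdict(set)
    for t in triplets:
        papers_by_entity[normalize(t.head)].add(t.paper_id)
        papers_by_entity[normalize(t.tail)].add(t.paper_id)
    return {e for e, papers in papers_by_entity.items() if len(papers) >= min_papers}


def build_kg_summary(triplets, min_papers=2):
    """Stage 2b: keep triplets whose head and tail were both promoted.

    Returns a dict mapping (head, relation, tail) to the set of papers
    that support that edge.
    """
    kept = promote_entities(triplets, min_papers)
    summary = defaultdict(set)
    for t in triplets:
        head, tail = normalize(t.head), normalize(t.tail)
        if head in kept and tail in kept:
            summary[(head, t.relation, tail)].add(t.paper_id)
    return dict(summary)


# Stage 1 output, written by hand here; paper ids are placeholders.
paper_level = [
    Triplet("Metformin", "treats", "Type 2 Diabetes", "paper-A"),
    Triplet("metformin", "treats", "type 2  diabetes", "paper-B"),
    Triplet("Metformin", "targets", "AMPK", "paper-B"),
]

kg = build_kg_summary(paper_level, min_papers=2)
for edge, papers in kg.items():
    print(edge, sorted(papers))
```

With this toy input, the "treats" edge survives because both of its entities occur in two papers, while the "targets" edge is dropped because AMPK appears in only one. A frequency threshold is only one possible promotion rule; an LLM-based consistency check could take its place without changing the overall shape.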

Access Knowledge Graph on Zenodo