Results Block 4

LLM + KG Clinical Impact

We integrate LogosKG with large language models (LLMs) in two rounds to improve clinical diagnosis. In Round 1, the knowledge graph (KG) retrieval results are used to filter the LLM's initial diagnosis. In Round 2, the LLM refines the Round 1 output by selecting additional evidence from the KG retrieval results.
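The two rounds can be illustrated with a minimal Python sketch. It is not the LogosKG implementation: the function names `round_one_filter` and `round_two_expand`, the `select` callback standing in for the LLM, and the reading of "filter" as "keep only diagnoses that also appear in the KG retrieval results" are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Set


def _norm(term: str) -> str:
    """Normalize a diagnosis string so LLM and KG terms can be compared."""
    return term.strip().lower()


def round_one_filter(llm_diagnoses: Iterable[str], kg_candidates: Set[str]) -> List[str]:
    """Round 1 (assumed semantics): keep LLM diagnoses that the KG also retrieved."""
    kg = {_norm(c) for c in kg_candidates}
    kept: List[str] = []
    for dx in llm_diagnoses:
        key = _norm(dx)
        if key in kg and key not in kept:
            kept.append(key)  # preserve the LLM's original ranking
    return kept


def round_two_expand(
    round_one: List[str],
    kg_candidates: Set[str],
    select: Callable[[List[str]], List[str]],
) -> List[str]:
    """Round 2 (assumed semantics): a selector, standing in for the LLM,
    picks extra items from the KG results not already kept in Round 1."""
    remaining = sorted({_norm(c) for c in kg_candidates} - set(round_one))
    result = list(round_one)
    for choice in select(remaining):
        key = _norm(choice)
        # Ignore anything the selector returns that is not in the KG results.
        if key in remaining and key not in result:
            result.append(key)
    return result
```

In this sketch the selector can only add items that came from the KG retrieval, so Round 2 never introduces a diagnosis without graph support. A real system would replace `select` with an LLM call that is prompted with the remaining candidates.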

Experimental Design Workflow

Figure 2. Overview of the two-round LLM + Knowledge Graph retrieval and reasoning workflow.

Diagnosis Performance Across Hop Distances

Figure 3. F1 performance of Baseline, Round 1, and Round 2 across hop distances (k = 1 to 5).

Clinical Reasoning Quality (PDSQI-9)

Figure 4. PDSQI-9 comparison for three models on DDXPlus with UMLS (k = 5).