Biomarker profiling in sepsis 
diagnostics 
 
 
 
 
Mahnaz Irani Shemirani 
 
 
 
Department of Laboratory Medicine 
Institute of Biomedicine 
Sahlgrenska Academy, University of Gothenburg 
 
 
 
 
 
Gothenburg 2024 
 
 
 
  
Cover illustration: “The Bacterial Sepsis”. © Mahnaz Irani Shemirani 2024

To my mom, Iran, whose encouragement and guidance illuminated my path in
academia; to my brother, Jamshid, for his unwavering moral and medical
support; to the enduring spirit of my dad, Khosro, who taught me resilience
during his brief time with me; and to the enduring spirit of my sister,
Mahshid, who shared my dream of earning a doctoral degree.
  
  
  
   
 
 
 
 
 
 
 
 
 
Biomarker profiling in sepsis diagnostics 
© Mahnaz Irani Shemirani 2024 
mahnaz.irani.shemirani@gu.se 
 
ISBN 978-91-8069-215-1 (PRINT)  
ISBN 978-91-8069-216-8 (PDF) 
Printed in Borås, Sweden 2024 
Printed by Stema Specialtryck AB
ABSTRACT
Effective and timely antibiotic therapy for sepsis requires a thorough
understanding of the types and molecular characteristics of bacterial strains.
Therefore, we investigated diagnostic strategies to facilitate faster
classification of bacteria and identification of their molecular features.
We benchmarked the 1928 Diagnostics platform (1928 Diagnostics,
Gothenburg, Sweden) for characterizing Staphylococcus aureus (S. aureus)
strains against an in-house bioinformatics (INH) pipeline and reference clinical
laboratory methods, including MALDI-TOF MS and phenotypic antibiotic
susceptibility testing. We observed high agreement between the 1928
platform and the INH pipeline in predicting laboratory results. Notably, the
1928 platform exhibited a lower rate of false negatives but a slightly
higher rate of false positives (Paper I). Additionally, our findings revealed that
clindamycin, erythromycin, and fusidic acid were effective against all
methicillin-resistant S. aureus strains, and that all tested strains were
susceptible to vancomycin (Paper II). The challenge remains in
predicting the bacterial type. Several studies have highlighted differences
between blood markers of gram-positive and gram-negative bacterial sepsis.
Using machine learning algorithms and the Proximity Extension Assay (PEA), we
identified a set of 55 informative proteins, including 5 potential
biomarkers, that distinguish patients with gram-positive or gram-negative
bacteria from other cases, achieving AUCs of 0.66 and 0.69,
respectively (Paper III). However, while the analysis of the 55 proteins offered
insights into classifying bacterial types, our method did not distinguish
between specific bacterial strains. A more comprehensive approach
using whole blood microarray technology on septic patients infected with
either S. aureus or Escherichia coli revealed 25 genes with high AUC values
(0.98 and 0.96, respectively) that effectively distinguished these infections
from other cases. These findings were consistent across two
independent datasets, with AUC values ranging from 0.72 to 0.87 (Paper IV).
In conclusion, efforts to improve diagnostic strategies and understand
bacterial characteristics in sepsis continue. Platforms like 1928 Diagnostics
and technologies such as the PEA show promise, with machine learning
offering opportunities to tackle bacterial typing challenges. These
advancements are crucial for evolving clinical practices in sepsis diagnosis
and management.
Keywords: pipeline, whole genome sequencing, machine learning, biomarker, proteomics, transcriptomics, gram-negative bacteria, gram-positive bacteria
 ISBN 978-91-8069-215-1 (PRINT)  
ISBN 978-91-8069-216-8 (PDF)  
SAMMANFATTNING PÅ SVENSKA (SUMMARY IN SWEDISH)
Effective and rapid antibiotic treatment of sepsis requires a thorough
understanding of the types and molecular characteristics of bacterial strains.
However, conventional diagnostic methods often lack the speed needed to
determine these characteristics accurately. We therefore investigated
diagnostic strategies to facilitate rapid classification of bacteria and to
shed light on their molecular characteristics. We compared the commercial
cloud-based 1928 platform (1928 Diagnostics, Gothenburg, Sweden) for
characterizing Staphylococcus aureus (S. aureus) strains against an in-house
bioinformatics (INH) pipeline and against the reference methods of the
clinical laboratory: MALDI-TOF MS and phenotypic antibiotic susceptibility
testing (AST). Our analysis showed high agreement between the 1928 platform
and the INH pipeline in predicting MALDI-TOF MS and AST results. Notably, the
1928 platform had a lower rate of incorrectly identified resistant phenotypes,
while it showed slightly higher rates of incorrectly identified susceptible
phenotypes (Study I). In addition, our findings showed that clindamycin,
erythromycin, and fusidic acid were effective against all methicillin-resistant
S. aureus strains, and that all tested strains were susceptible to vancomycin
(Study II). The challenge remains in predicting the bacterial type. Several
studies have pointed to differences between blood markers of gram-positive and
gram-negative bacterial sepsis. Using machine learning algorithms and the
Proximity Extension Assay (PEA), we discovered a panel of 55 informative
proteins, including 5 potential biomarkers, that distinguish patients with
gram-positive or gram-negative bacteria from other cases, achieving AUC values
of 0.66 and 0.69, respectively (Study III). However, while the analysis of the
55 proteins offered insights into classifying bacterial types, our method did
not distinguish between specific bacterial strains. Using a more comprehensive
approach involving whole blood microarray technology on septic patients
infected with either S. aureus or Escherichia coli, 25 genes were identified
with high AUC values (0.98 and 0.96, respectively) that effectively
distinguished these infections from other cases. These findings were
consistent across two separate, independent datasets, with AUC values ranging
from 0.72 to 0.87 (Study IV).
In conclusion, efforts to improve diagnostic strategies and understand
bacterial characteristics in sepsis continue. Platforms such as 1928
Diagnostics and technologies such as PEA show promise, with machine learning
offering opportunities to address bacterial typing challenges. These advances
are crucial for developing clinical practice in the diagnosis and treatment of
sepsis.
LIST OF PAPERS
This thesis is based on the following studies, referred to in the text by their
Roman numerals.
I. Shemirani, M.I., Tilevik, D., Tilevik, A., Jurcevic, S.,
Arnellos, D., Enroth, H., Pernestig, A.K. Benchmarking of
two bioinformatic workflows for the analysis of whole-genome
sequenced Staphylococcus aureus collected from
patients with suspected sepsis. BMC Infect Dis 2023, 23(1), 39.
DOI: 10.1186/s12879-022-07977-0
II. Irani Shemirani, M., Ljungström, L. Epidemiology and
antibiotic resistance patterns of Staphylococcus aureus
strains in suspected sepsis patients in Skaraborg.
(Submitted)
III. Irani Shemirani, M., Pernestig, A.K., Björkman, J., Tilevik,
D., von Mentzer, A., Ejdebäck, M., Ståhlberg, A.
Identification of protein biomarkers to differentiate
between gram-negative and gram-positive infections in
adults suspected to sepsis. (Under Review)
IV. Irani Shemirani, M. Transcriptional markers classifying
Escherichia coli and Staphylococcus aureus induced sepsis
in adults: a data-driven approach. PLOS ONE 2024, 19(7).
DOI: 10.1371/journal.pone.0305920

ADDITIONAL PAPERS
I. Irani Shemirani, M. Biomarkers approach in the diagnosis and
prognosis of sepsis. Int. J. Public Health Res 2022, 12(2).
 
  
  
 
  
  
 
  
  
 
  
  
   
CONTENT

ABBREVIATIONS
1 INTRODUCTION
1.1 Definition
1.2 Epidemiology
1.3 Etiology
1.4 Pathogenesis
1.5 Diagnosis
2 AIM
3 MATERIALS AND METHODS
3.1 Subjects
3.2 Methods
3.3 Statistical analysis
3.4 Ethical consideration
4 RESULTS AND DISCUSSION
4.1 Paper I - Benchmarking of two bioinformatics workflows
4.2 Paper II - Epidemiology and antibiotic resistance pattern
4.3 Paper III - Identifying a possible protein biomarker panel
4.4 Paper IV - Transcriptomic markers
5 CONCLUSION
6 FUTURE PERSPECTIVES
ACKNOWLEDGEMENT
REFERENCES
 
  
ABBREVIATIONS

ADA Adenosine Deaminase
AST Antibiotic susceptibility test
AUC-ROC Area Under the Receiver Operating Characteristic Curve
CD8A T-Cell Surface Glycoprotein CD8 Alpha Chain
CFHR5 Complement Factor H Related 5
CM Cardiometabolic
CNDP1 Carnosine Dipeptidase 1
CRP C-reactive protein
CSF-1 Colony Stimulating Factor 1
CVD II Cardiovascular II
DAMPs Damage-associated molecular patterns
DIC Disseminated Intravascular Coagulation
E. coli Escherichia coli
EHR Electronic health record
et(A,B) Exfoliative toxins (A, B)
EUCAST European Committee on Antimicrobial Susceptibility Testing
GDF2 Growth Differentiation Factor 2
ICD International Classification of Diseases
ICU Intensive Care Unit
IFN-γ Interferon gamma
IHME Institute for Health Metrics and Evaluation
IL Interleukin
Inf Inflammation
INH In-house pipeline
IR Immune Response
LASSO Least Absolute Shrinkage and Selection Operator
LBP Lipopolysaccharide binding protein
LOD Limit of detection
LPS Lipopolysaccharide
LR Logistic regression
LTA Lipoteichoic acid
MALDI-TOF MS Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry
MBL2 Mannose Binding Lectin 2
MCODE Molecular Complex Detection
ME Major error
MFAP5 Microfibrillar-Associated Protein 5
MLST Multilocus sequence typing
MNAR Missing Not At Random
MRSA Methicillin-resistant Staphylococcus aureus
MSE Mean square error
NGS Next-generation sequencing
NPX Normalized protein expression values
PAMPs Pathogen-associated molecular patterns
PANTHER Protein ANalysis THrough Evolutionary Relationships
PCA Principal Component Analysis
PCC Pearson correlation coefficient
PCR Polymerase chain reaction
PCT Procalcitonin
PEA Proximity Extension Assay
PPI Protein-protein interaction
PRR Pattern recognition receptors
PVL Panton-Valentine Leucocidin
qSOFA Quick Sequential Organ Failure Assessment
RF Random forest
RFE Recursive Feature Elimination
RLRs RIG-I-like receptors
SAA4 Serum Amyloid A4
S. argenteus Staphylococcus argenteus
S. aureus Staphylococcus aureus
S. epidermidis Staphylococcus epidermidis
SIRS Systemic Inflammatory Response Syndrome
SOFA Sequential Organ Failure Assessment
TLRs Toll-like receptors
TNF Tumor Necrosis Factor
TNFRSF TNF Receptor Superfamily Member
tSNE t-distributed stochastic neighbor embedding
tsst1 Toxic shock syndrome toxin-1
VME Very major error
WGS Whole genome sequencing
WHO World Health Organization
1 INTRODUCTION

1.1 DEFINITION

The word sepsis, σηψις, originated from the Greek word sepsin, meaning
“decomposition” or “decay”. The first documented use of the word
sepsis is found in Homer’s poems 2700 years ago as a derivative form of the
word sepo, σηπω, meaning “I rot”. For centuries, the term was used by
Hippocrates, Aristotle, Galen, and Plutarch as a clinical description of
systemic inflammation (1). Van Leeuwenhoek first observed living bacteria
in 1674 (2). During the 19th and 20th centuries, sepsis was described by the
“germ theory”, in which pathogenic microorganisms invade the bloodstream and
thereby cause the onset of systemic infection symptoms (3). At this time
the germ theory prevailed, even though a number of scientists, including
Sir William Osler, held that the patient’s bodily response to the infection,
rather than the infection itself, was the cause of death (4). In the 20th
century, the deaths of patients with sepsis despite antibiotic treatment and
pathogen eradication, along with experimental tests, highlighted the
importance of the host’s immune response in the manifestations of sepsis.
In 1991, during a conference of the American College of Chest Physicians and
the Society of Critical Care Medicine (ACCP/SCCM), the first consensus
definition of sepsis (sepsis-1) was established with the aim of improving
clinical diagnosis and standardizing research protocols. ACCP/SCCM
introduced criteria for systemic inflammatory response syndrome (SIRS) and
defined sepsis as the presence of at least two of the SIRS criteria as a result of
infection (Figure 1) (5). The sepsis-1 definition also distinguished clinical
stages of sepsis. Severe sepsis was defined as sepsis accompanied by
organ dysfunction, hypoperfusion, or hypotension,
and septic shock was clinically defined by hypotension resistant to fluid and
vasopressor therapy (5).
Figure 1. Stages of sepsis according to the Sepsis-1 definition. Sepsis progresses from a systemic
inflammatory response to infection (sepsis), to severe sepsis (sepsis with organ dysfunction),
and finally to septic shock (severe sepsis with persistent hypotension despite fluid resuscitation).
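The Sepsis-1 staging described above can be read as a simple decision rule. The following Python sketch is only an illustration of that reading and is not part of the consensus definition or of the methods used in this thesis. The function name, the parameter names, and the enum are hypothetical, and the individual SIRS criteria are not enumerated here, so the number of criteria met is assumed to be supplied by the caller.

```python
from enum import Enum


class Sepsis1Stage(Enum):
    """Clinical stages named in the Sepsis-1 definition (labels are illustrative)."""
    NO_SEPSIS = "no sepsis"
    SEPSIS = "sepsis"
    SEVERE_SEPSIS = "severe sepsis"
    SEPTIC_SHOCK = "septic shock"


def classify_sepsis1(
    infection: bool,
    sirs_criteria_met: int,
    organ_dysfunction: bool = False,
    hypoperfusion: bool = False,
    hypotension: bool = False,
    hypotension_refractory: bool = False,
) -> Sepsis1Stage:
    """Hypothetical helper mapping findings to a Sepsis-1 stage.

    Assumption: `sirs_criteria_met` is the count of SIRS criteria already
    judged to be present as a result of the infection.
    """
    # Sepsis requires infection plus at least two SIRS criteria.
    if not infection or sirs_criteria_met < 2:
        return Sepsis1Stage.NO_SEPSIS
    # Septic shock: hypotension resistant to fluid and vasopressor therapy.
    if hypotension and hypotension_refractory:
        return Sepsis1Stage.SEPTIC_SHOCK
    # Severe sepsis: sepsis with organ dysfunction, hypoperfusion, or hypotension.
    if organ_dysfunction or hypoperfusion or hypotension:
        return Sepsis1Stage.SEVERE_SEPSIS
    return Sepsis1Stage.SEPSIS


# Example: infection, three SIRS criteria, hypotension that responds to therapy.
print(classify_sepsis1(True, 3, hypotension=True).value)  # prints "severe sepsis"
```

The stages are checked from most to least severe, so a patient who meets the septic shock condition is not reported as having severe sepsis only.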
The clinical picture of SIRS considered in this definition is nonspecific and
manifests in many conditions. Moreover, inflammation is a generic response
to any stimulus, from minor trauma to autoimmune disease. Due to the
limitations, the list of diagnostic criteria was expanded in 2001 during the
second consensus conference (sepsis-2), incorporating organ dysfunction
criteria for diagnosing sepsis, while the definition of severe sepsis remained
unchanged. This caused confusion among researchers and clinicians when
distinguishing between ‘sepsis’ under the new criteria and ‘severe sepsis’
under the old criteria (5).

With deepened knowledge of the pathophysiology of sepsis, the need to change
the definition of sepsis was recognized. In 2016, a new sepsis definition
(sepsis-3) was proposed in which sepsis was defined as a “life-threatening
organ dysfunction caused by a dysregulated host response to infection” and
septic shock as a “subset of sepsis in which underlying circulatory, cellular and
metabolic abnormalities are profound enough to substantially increase
mortality” (6). The terms sepsis and severe sepsis thereby became
interchangeable, resolving the confusion in the old definition, and organ
dysfunction must now be included in the clinical diagnosis of sepsis.

In Sweden, a working group comprising representatives from the Swedish
Society of Infectious Diseases and the Swedish Society of Intensive Care
Medicine published a consensus document on the definition and criteria for
severe sepsis and septic shock in 2011. With few exceptions, the consensus
was drafted using the definitions and criteria from Sepsis-1 (1991) and
Sepsis-2 (2001). The key distinction was in the definition of severe sepsis,
which was characterized as a proven infection with organ
dysfunction (7). This definition was used to classify patients with severe sepsis
until 2016, when the new definition by the third international consensus
(Sepsis-3) integrated severe sepsis within the definition of sepsis. A national
consensus group was formed in the autumn of 2016 on behalf of the Swedish
Association of Infectious Disease Physicians (SILF), the Swedish Society of

1.2 EPIDEMIOLOGY

Sepsis is a critical global health issue, marked by significant morbidity and
mortality. Historically, sepsis has been linked to major epidemics, such as the
Yellow Fever outbreak in Philadelphia (1793), the Ebola epidemic in West
Africa (2013-2016), and the COVID-19 pandemic (2019-2022). The definition
and diagnostic criteria for sepsis have evolved over time (see Chapter 1.1),
leading to variations in reporting and estimation across different countries,
which complicates accurate epidemiological assessments.

A study by the Institute for Health Metrics and Evaluation (IHME) in 2020
estimated 49 million global cases of sepsis annually, with 11 million
sepsis-related deaths based on death certificates. This study, covering 195 countries
and 282 causes of death, reported a 37% decrease in global sepsis incidence
and a 31% reduction in sepsis-related deaths from 1990 to 2017 (9). However,
these estimates, based primarily on death certificates, may not fully capture
sepsis or related organ dysfunction (9).

In most studies, sepsis epidemiology is studied using administrative hospital
discharge data, identified through International Classification of Diseases
(ICD-9 and ICD-10) codes. Fleischmann-Struzek et al. (2020) found that
ICD-based estimates of hospital-treated sepsis were approximately 50% lower
than those from IHME (10). They reported a global incidence of 189 cases per
100,000 person-years and a 26.7% mortality rate for hospital-treated sepsis,
with ICU-treated sepsis having an incidence of 58 per 100,000 person-years
and a mortality rate of 41.9% (10). Despite its global standardization by the
World Health Organization (WHO), ICD estimates can be skewed by
Acute Care (SWESEM), the Swedish Society of Anesthesiology and Intensive variations in ICD revisions and local modifications (9), as well as factors like 
Care (SFAI), and the Swedish Intensive Care Register (SIR), as well as a diagnosis accuracy, infection misclassification, documentation quality, and 
representative from the National Institute of Health and Welfare, department reimbursement incentives (11). 
of disease classification. This group’s mission was to examine Sepsis-3 and 
decide how it ought to be applied in Swedish healthcare. The consensus group The growing use of electronic health record (EHR) systems allows for the 
recommended that the definitions of sepsis and septic shock proposed in investigation sepsis epidemiology using clinical criteria instead of ICD data. 
Sepsis-3 should replace the previous Swedish definitions. It was also Rhee et al. (2017) analyzed EHR records of over 2.9 million adults hospitalized 
recommended that the new international criteria be used when diagnosing and across 409 US hospitals between 2009 and 2014, finding a 6%  incidence of 
classifying sepsis and septic shock (8). sepsis (12). Their EHR-based approach demonstrated a sensitivity of 70% and 
a comparable positive predictive value of 70.4% to ICD-based approach. 
While EHR data showed constant sepsis prevalence and mortality rate, ICD 
data indicated an annual increase in prevalence (+10.3% [95% CI, 7.2% to 
13.3%], p < .001) and a decrease in mortality (−7.0% [95% CI, −8.8% to 
16 17 
  
−5.2%], p < .001). This suggests that ICD-based estimates may be affected by
clinical awareness and coding practices, while EHR data offer a more
objective measure of sepsis.

In Sweden, in 2020, a multicenter study reported an incidence rate of 81
ICU-treated sepsis cases per 100,000 persons and an in-hospital mortality rate of
26%, based on Sepsis-3 criteria (13). The study highlighted a significant
discrepancy between clinical data and ICD discharge codes, with only
one-third of sepsis patients coded for the condition upon ICU discharge (13). This
discrepancy again underscores the influence of documentation and coding
quality on the accuracy of ICD-based sepsis estimates.

1.3 ETIOLOGY

Humans are always faced with the threat of pathogenic microorganisms. Our
survival depends on the innate and adaptive immune systems. The body’s
primary defenses against infection, including the skin, enzymes, and mucus,
can be compromised, allowing microorganisms to invade and potentially cause
sepsis (14).

Gram-negative and gram-positive bacterial infections are the primary causes
of sepsis; however, viruses, parasites, and fungi also contribute significantly to
sepsis, especially among immunocompromised patients and those with other
co-morbidities (15, 16). Currently, gram-negative bacteria are isolated in 62.2% of
patients with positive blood cultures, and gram-positive bacteria account for
46.8% of sepsis cases (15). Commonly implicated gram-negative bacteria
include Haemophilus influenzae, Escherichia coli (E. coli), Salmonella
spp., and Neisseria meningitidis. Among gram-positive bacteria causing
sepsis, the most common contributors are Streptococcus pneumoniae and
Staphylococcus aureus (S. aureus), especially methicillin-resistant
Staphylococcus aureus (MRSA) (15, 16).

Pneumonia is the most common infection among patients in whom sepsis is
the immediate cause of death, followed by intra-abdominal
infections and intravascular infections (17). However, urinary tract, skin, bone,
and brain infections (such as meningitis) can also develop into sepsis. These
infections are often localized and controlled by the host immune system. Sepsis
often progresses when the host is unable to suppress the primary infection due
to factors such as a high bacterial load, the presence of virulence factors, or
defects in the immune system.

The virulence mechanisms of bacteria vary among different species and
strains, significantly impacting the progression and severity of sepsis (18). For
instance, in gram-negative bacteria like E. coli, endotoxins such as
lipopolysaccharides (LPS) from the bacterial cell wall can trigger intense
inflammatory responses that contribute to sepsis (19, 20) (Figure 2). In
contrast, gram-positive bacteria like S. aureus produce exotoxins, such as toxic
shock syndrome toxin (TSST), which can also play a role in sepsis by
amplifying the inflammatory response and causing tissue damage (21, 22).
Figure 2. Comparison of gram-negative and gram-positive bacterial cell walls. The cell wall of
gram-negative bacteria possesses a thin peptidoglycan layer, along with an additional outer
layer made up of lipopolysaccharides. In contrast, gram-positive bacteria feature a thick
peptidoglycan layer. Adapted from Atanasova KR (23).

Some people are at higher risk of sepsis. Sepsis is more common among
the elderly population and infants less than three months old (15). Diabetes,
cardiovascular diseases, steroid treatment, organ transplantation, cancer,
and chronic obstructive pulmonary disease (COPD) also increase
susceptibility to bacterial infections that may develop into sepsis. Here, a
compromised immune system is the main factor in the progression of sepsis
(15).

1.4 PATHOGENESIS

Dysregulation of complex processes, including the innate and adaptive
immune response, complement activation, the coagulation cascade, and the
endothelial vascular system, contributes to the development of sepsis (24, 25).
In gram-negative bacterial infection, the pathogen is recognized by the innate
immune system through the interaction of pathogen recognition receptors
(PRRs) with exogenous pathogen-associated molecular patterns (PAMPs)
and endogenous damage-associated molecular patterns (DAMPs) (26, 27).
Toll-like receptors (TLRs), C-type lectin receptors (CLRs), RIG-I-like
receptors (RLRs), NOD-like receptors (NLRs), and AIM2-like receptors
(ALRs) are the five types of PRRs that have been discovered thus far (26, 27).

LPS, a component of bacterial cell walls also known as an endotoxin, is the
most frequent causative factor in gram-negative bacterial sepsis. This compound,
unique to gram-negative bacteria, is released by bacterial lysis and consists of three
components. The outer domain is known as the O-antigen, and its chain
composition varies from strain to strain, resulting in different antigen effects.
The middle layer is known as the “core”, and it is a less diverse oligosaccharide
domain. The third layer is a conserved hydrophobic inner domain known as
lipid A (or endotoxin) (28). Lipid A is a prime example of a PAMP and an
immune system inducer. Bacterial endotoxin binds to a lipopolysaccharide
binding protein (LBP), resulting in activation of macrophages and initiation of
the coagulation cascade (28, 29). Similarly, lipoteichoic acid (LTA) released in
gram-positive infections affects macrophage function and results in the
production of mediators. Both toxins stimulate macrophage CD14, TLR4, and
TLR2 receptors, resulting in the release of cytokine mediators (28), which are
necessary for immune reactions that may develop into sepsis and septic shock
(Figure 3).

Gram-positive bacteria with exotoxin release can occasionally cause sepsis and
septic shock as well. Superantigens are the most common exotoxins produced
by gram-positive bacteria. The superantigens activate T-lymphocytes and
trigger the coagulation cascade, which results in the production of Interferon
gamma (IFN-γ) and Interleukin-2 (IL-2). Both IFN-γ and IL-2 stimulate
macrophages to produce IL-1 and Tumor Necrosis Factor alpha (TNF-α).
These mediators play a crucial role in triggering the body’s inflammatory
response to infections (3, 28-30).

Figure 3. Activation of coagulation cascade by Lipopolysaccharide (LPS). LPS stimulates TLR4
and CD14 receptors on macrophages either by direct binding to the receptors or, more
commonly, by being transferred to the receptors via LPS-binding protein in the serum. Adapted
from Raetz CR, Whitfield C (31).

In practice, it is the complex interaction of pro-inflammatory and
anti-inflammatory mediators that results in sepsis. TNF-α is the main
pro-inflammatory mediator responsible for the onset of sepsis (25, 28, 32).
Pro-inflammatory cytokines also stimulate neutrophils, platelets, lymphocytes,
liver cells, endothelial cells, and macrophages, which can result in tissue
damage, vascular dilation, and lung dysfunction (3). The activation of the
complement system, which can result in the production of anaphylatoxins and
a severe inflammatory response, causes other signs and symptoms of sepsis
such as increased levels of C-reactive protein (CRP), inhibition of fibrinolysis,
and, eventually, Disseminated Intravascular Coagulation (DIC). DIC
frequently occurs as a result of infection by gram-negative bacteria and may
lead to homeostatic imbalance and organ dysfunction (26, 33).

1.5 DIAGNOSIS

Sepsis is challenging to diagnose because the symptoms and signs may overlap
with those of other conditions. To accurately diagnose an underlying infection,
clinicians often recommend a series of tests. The diagnosis of sepsis relies on
evaluating clinical symptoms, analyzing blood biomarkers, reviewing
microbiological results, and conducting imaging studies.

Figure 4. Overview of sepsis diagnosis. A combination of clinical symptoms, blood biomarkers,
microbiological findings, and imaging assessment is used to identify the source of infection and
assess organ involvement.
1.5.1 CLINICAL DIAGNOSIS

Physiological scoring systems, such as the SOFA score, are used to assess the
extent of a patient’s organ function or dysfunction (Table 1). Developed in the
early 1990s, the SOFA score has been implemented in ICU monitoring for
critically ill patients and has become essential for sepsis diagnosis with the
adoption of the new sepsis definition in 2016. According to the Sepsis-3
criteria, an increase in the SOFA score by 2 or more points indicates organ
dysfunction, and when combined with a suspected or documented infection, it
defines sepsis (6).

Table 1. Sequential Organ Failure Assessment (SOFA) scoring system*

Indicator/Score | 0 | 1 | 2 | 3 | 4
PaO2/FIO2, mm Hg | ≥400 | <400 | <300 | <200 with respiratory support | <100 with respiratory support
Platelets, x10^3/μL | ≥150 | <150 | <100 | <50 | <20
Bilirubin, mg/dL | <1.2 | 1.2-1.9 | 2.0-5.9 | 6.0-11.9 | >12.0
Hypotension | MAP ≥70 mm Hg | MAP <70 mm Hg | Dopamine <5 or dobutamine (any dose) a | Dopamine 5.1-15 or epinephrine ≤0.1 or norepinephrine ≤0.1 a | Dopamine >15 or epinephrine >0.1 or norepinephrine >0.1 a
GCS score b | 15 | 13-14 | 10-12 | 6-9 | <6
Creatinine, mg/dL | <1.2 | 1.2-1.9 | 2.0-3.4 | 3.5-4.9 | >5.0
Urine output, mL/d | | | | <500 | <200

*Adapted from Singer et al. (6)
PaO2, partial pressure of oxygen; FIO2, the fraction of inspired oxygen; MAP, mean arterial
pressure
a Catecholamine doses are given as μg/kg body weight/min for at least 1 hour.
b GCS (Glasgow Coma Scale) scores range from 3-15; a higher score indicates better
neurological function.

1.5.2 BACTERIA IDENTIFICATION

Microbiological diagnosis is a complex process involving two main
methodologies for detecting and identifying bacteria in patients:
culture-dependent methods and culture-independent methods.

Culture-dependent methods
Blood culture is still the most common method for confirming bacterial
infection in clinical practice, though it detects bacteremia only in about 50%
of patients who clinically suffer from sepsis (34). The positivity rate is
lowered when antibiotics are administered before the blood sample is drawn
(35). Rapid diagnostic technology such as Matrix-Assisted Laser
Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS)
can assist by identifying pathogens directly from positive blood cultures.
MALDI-TOF MS can determine the type of pathogen within 2-6 hours,
significantly reducing the time needed to identify the appropriate antibiotic
therapy and thereby improving patient outcomes (36).

Culture-independent methods
Due to the challenges associated with blood culture methods, such as their
low sensitivity, long turnaround times, and risk of contamination, there has
been a push towards alternative, culture-independent techniques (37).
Polymerase chain reaction (PCR) has been used to diagnose bloodstream
infections directly from blood samples (38). However, PCR lacks the
capability to identify bacterial pathogenicity.
Recently, nucleic acid sequencing technologies have gained prominence as
they address the limitations of both traditional blood cultures and PCR
methods. Sequencing technologies, such as next-generation sequencing
(NGS), provide a comprehensive view of the microbial community by
analyzing the entire DNA or RNA present in a sample. This allows for the
identification of a broad range of pathogens and the detection of antibiotic
resistance genes with high precision. Moreover, sequencing can reveal detailed
genetic information about pathogens, including their virulence factors and
resistance profiles (39-42). This advancement promises not only faster and
more accurate diagnoses but also deeper insights into the underlying
mechanisms of infection. As sequencing technology continues to evolve, its
integration into clinical workflows could significantly enhance the
management of sepsis and other infections by providing real-time data and
supporting more targeted treatment strategies.
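
As an illustration of the scoring described in Chapter 1.5.1, the thresholds in Table 1 and the Sepsis-3 rule (a SOFA increase of 2 or more points together with a suspected or documented infection) can be sketched in Python. This is a minimal sketch for clarity, not a validated clinical tool. The class and function names (Assessment, sofa_score, meets_sepsis3) are hypothetical, the baseline SOFA score is assumed to be supplied by the caller, and values that fall between the printed ranges of Table 1 (for example a bilirubin of exactly 12.0 mg/dL or a dopamine dose of exactly 5) are assigned by assumption as noted in the comments.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One set of measurements; field names are hypothetical, units follow Table 1."""
    pao2_fio2: float                # mm Hg
    respiratory_support: bool
    platelets: float                # x10^3/uL
    bilirubin: float                # mg/dL
    mean_arterial_pressure: float   # mm Hg
    gcs: int                        # Glasgow Coma Scale, 3-15
    creatinine: float               # mg/dL
    urine_output: float             # mL/d
    dopamine: float = 0.0           # catecholamine doses in ug/kg/min
    dobutamine: float = 0.0
    epinephrine: float = 0.0
    norepinephrine: float = 0.0


def count_below(value, cutoffs):
    """Number of cutoffs the value lies below (indicators where low is worse)."""
    return sum(value < c for c in cutoffs)


def count_reached(value, cutoffs):
    """Number of cutoffs the value has reached (indicators where high is worse).
    Assumption: a value equal to a cutoff takes the higher score."""
    return sum(value >= c for c in cutoffs)


def respiration(a):
    # Scores 3 and 4 require respiratory support, as stated in Table 1.
    if a.respiratory_support and a.pao2_fio2 < 100:
        return 4
    if a.respiratory_support and a.pao2_fio2 < 200:
        return 3
    return count_below(a.pao2_fio2, (400, 300))


def cardiovascular(a):
    if a.dopamine > 15 or a.epinephrine > 0.1 or a.norepinephrine > 0.1:
        return 4
    # Assumption: a dopamine dose of exactly 5 is scored 2
    # (Table 1 prints "<5" and "5.1-15").
    if a.dopamine > 5 or a.epinephrine > 0 or a.norepinephrine > 0:
        return 3
    if a.dopamine > 0 or a.dobutamine > 0:
        return 2
    return 1 if a.mean_arterial_pressure < 70 else 0


def renal(a):
    # Creatinine and urine output share one row group; the worse of the two counts.
    by_creatinine = count_reached(a.creatinine, (1.2, 2.0, 3.5, 5.0))
    by_urine = 4 if a.urine_output < 200 else 3 if a.urine_output < 500 else 0
    return max(by_creatinine, by_urine)


def sofa_score(a):
    """Total SOFA score: sum of the six organ-system sub-scores (0-4 each)."""
    return (
        respiration(a)
        + count_below(a.platelets, (150, 100, 50, 20))
        + count_reached(a.bilirubin, (1.2, 2.0, 6.0, 12.0))
        + cardiovascular(a)
        + count_below(a.gcs, (15, 13, 10, 6))
        + renal(a)
    )


def meets_sepsis3(baseline_sofa, current_sofa, infection_suspected):
    """Sepsis-3 rule: SOFA increase of >= 2 points plus suspected or documented infection."""
    return infection_suspected and (current_sofa - baseline_sofa) >= 2


if __name__ == "__main__":
    # Hypothetical patient, for illustration only.
    patient = Assessment(
        pao2_fio2=180, respiratory_support=True, platelets=90, bilirubin=2.5,
        mean_arterial_pressure=60, gcs=13, creatinine=1.5, urine_output=400,
        norepinephrine=0.05,
    )
    score = sofa_score(patient)
    print(score, meets_sepsis3(0, score, infection_suspected=True))
```

Keeping each organ system in its own small function mirrors the row structure of Table 1, so any threshold can be checked against the table line by line.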
  
Biomarker profiling in sepsis diagnostics Mahnaz Irani Shemirani 
1.5.1 CLINICAL DIAGNOSIS 1.5.2 BACTERIA IDENTIFICATION 
  
Physiological scoring systems, such as the SOFA score, are used to assess the Microbiological diagnosis is a complex process involving two main 
extent of a patient’s organ function or dysfunction (Table 1). Developed in the methodologies for detecting and identifying bacteria in patients: culture-
early 1990s, the SOFA score has been implemented in ICU monitoring for
critically ill patients and has become essential for sepsis diagnosis with the
adoption of the new sepsis definition in 2016. According to the Sepsis-3
criteria, an increase in the SOFA score by 2 or more points indicates organ
dysfunction, and when combined with a suspected or documented infection, it
defines sepsis (6).

Table 1. Sequential Organ Failure Assessment (SOFA) scoring system*

Indicator/Score | 0 | 1 | 2 | 3 | 4
PaO2/FIO2, mm Hg | ≥400 | <400 | <300 | <200 with respiratory support | <100 with respiratory support
Platelets, ×10³/μL | ≥150 | <150 | <100 | <50 | <20
Bilirubin, mg/dL | <1.2 | 1.2-1.9 | 2.0-5.9 | 6.0-11.9 | >12.0
Hypotension | MAP ≥70 mm Hg | MAP <70 mm Hg | Dopamine <5 or dobutamine (any dose) a | Dopamine 5.1-15 or epinephrine ≤0.1 or norepinephrine ≤0.1 a | Dopamine >15 or epinephrine >0.1 or norepinephrine >0.1 a
GCS score b | 15 | 13-14 | 10-12 | 6-9 | <6
Creatinine, mg/dL | <1.2 | 1.2-1.9 | 2.0-3.4 | 3.5-4.9 | >5.0
Urine output, mL/d |  |  |  | <500 | <200

*Adapted from Singer et al. (6)
PaO2, partial pressure of oxygen; FIO2, fraction of inspired oxygen; MAP, mean
arterial pressure
a Catecholamine doses are given as μg/kg body weight/min for at least 1 hour.
b GCS (Glasgow Coma Scale) scores range from 3-15; a higher score indicates
better neurological function.

dependent methods and culture-independent methods.

Culture-dependent methods
Blood culture is still the most common method for confirming bacterial
infection in clinical practice, though it detects bacteremia in only about 50%
of patients who clinically suffer from sepsis (34). The positivity rate is
lower when antibiotics are administered before the blood sample is drawn (35).
Rapid diagnostic technologies such as Matrix-Assisted Laser
Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) can
assist by identifying pathogens directly from positive blood cultures.
MALDI-TOF MS can determine the type of pathogen within 2-6 hours,
significantly reducing the time needed to identify the appropriate antibiotic
therapy and thereby improving patient outcomes (36).

Culture-independent methods
Due to the challenges associated with blood culture methods, such as their
low sensitivity, long turnaround times, and risk of contamination, there has
been a push towards alternative, culture-independent techniques (37).
Polymerase chain reaction (PCR) has been used to diagnose bloodstream
infections directly from blood samples (38). However, PCR lacks the
capability to identify bacterial pathogenicity.
Recently, nucleic acid sequencing technologies have gained prominence as
they address the limitations of both traditional blood cultures and PCR
methods. Sequencing technologies, such as next-generation sequencing
(NGS), provide a comprehensive view of the microbial community by
analyzing the entire DNA or RNA present in a sample. This allows for the
identification of a broad range of pathogens and the detection of antibiotic
resistance genes with high precision. Moreover, sequencing can reveal detailed
genetic information about pathogens, including their virulence factors and
resistance profiles (39-42). This advancement promises not only faster and
more accurate diagnoses but also deeper insights into the underlying
mechanisms of infection. As sequencing technology continues to evolve, its
integration into clinical workflows could significantly enhance the
management of sepsis and other infections by providing real-time data and
supporting more targeted treatment strategies.
1.5.3 BLOOD BIOMARKERS

A biomarker is defined as “a characteristic that is measured as an indicator of
natural biological processes, pathogenic processes, or response to exposure or
intervention” (43). Biomarkers have a wide range of potential applications,
including disease diagnosis, prognosis, and treatment monitoring, as well as in
research and drug development (see Table 2 for more details).

Table 2. Application of biomarkers*

Type | Purpose
Diagnostic | To identify a person who has a particular disease.
Monitoring | To examine serially the status of a disease or medical condition for signs of exposure to a medical product or environmental agent, or a biological agent’s or medical product’s effects.
Pharmacodynamic | To demonstrate how the body responds to a medication or environmental factor.
Predictive | To predict whether an individual or a group of individuals is more likely to experience a favorable or unfavorable condition.
Prognostic | To determine a patient’s risk of experiencing a clinical event, a disease recurrence, or a disease progression in relation to a certain illness or condition.
Safety | To determine the toxicity of a medical intervention as an adverse event.
Susceptibility/risk | To identify a person who does not already have a clinically obvious disease or medical condition but has the risk of getting the disease or medical condition in the future.

*Definitions adopted from Califf (43)

Sepsis leads to alterations in the expression and function of various
endogenous molecules, which often reflect the level of immune activation or
suppression. These changes can be monitored to observe shifts in the immune
response over time. However, because many of these molecules may only
temporarily reflect the immunological status, they must be interpreted
with caution. While they may have potential as biomarkers, the heterogeneity
in immune responses has posed significant challenges for research.

For years, researchers have examined blood levels of immune system proteins
to identify sepsis. Despite numerous studies, sepsis-specific biomarkers remain
elusive. Over 250 biomarkers have been proposed for sepsis diagnosis, but
only a few are used in clinical practice, and they often suffer from low
sensitivity and specificity (26, 44). Nevertheless, these biomarkers continue to
be valuable for predicting sepsis risk, stratifying patients, evaluating treatment
efficacy, and assessing prognosis (26, 45, 46). Below, some of the most
researched protein biomarkers for sepsis are described.

Acute phase reactants
Peripheral blood contains acute phase reactants such as CRP and procalcitonin
(PCT), both of which rise rapidly in septic patients. These markers are useful
indicators of inflammation and infection and are often used in conjunction with
other variables to diagnose sepsis and assess the effectiveness of treatment (26,
45-47).

Cytokines
Cytokines are signaling molecules released by various cell types, including
immune cells like monocytes, macrophages, and lymphocytes, as well as
endothelial cells, fibroblasts, and stromal cells (26). Among the most
investigated pro-inflammatory cytokines are TNF-α and IL-6. Both are
associated with organ damage and mortality, which helps in predicting patient
outcomes (26, 46). TNF-α has a very short half-life of about 17 minutes,
whereas IL-6 persists in the bloodstream for considerably longer (typically 24
to 48 hours). This stability makes IL-6 more practical for assessing inflammation
and predicting prognosis (26, 45, 46). IL-1 is another key pro-inflammatory
cytokine that triggers the body’s inflammatory response during infections.
IL-1β, the most extensively studied form of IL-1, is activated by caspase-1 (48).
Elevated levels of IL-1β, reaching 1.22 pg/ml, can indicate a high risk of death
within 48 hours in septic patients (49). Furthermore, interleukin-27 (IL-27) is
an immunosuppressive cytokine that is often found at elevated levels in septic
patients (50, 51). IL-27 is considered a reliable biomarker for sepsis, showing
high specificity and sensitivity for bacterial infections (52).

Cell-surface biomarkers
Various cell surface markers are used to assess the activation status of
neutrophils, monocytes, and T-cells. CD64 is one of the most extensively
studied myeloid markers, reflecting the early stages and prognosis of diseases,
and it has shown considerable effectiveness in the early detection of sepsis in
newborns (46). Other surface proteins, such as CD14 and LBP, also increase
during infection. LBP interacts with CD14 on the surface of monocytes and
macrophages, leading to the release of a soluble form of CD14. Soluble CD14
(sCD14) is emerging as a highly promising biomarker for monocytes, as its
levels rise earlier than IL-6 and PCT. Consequently, sCD14 is considered a key
predictive biomarker for diagnosing and prognosticating sepsis (26, 46).

Other biomarkers
Currently, lactate is the most used biomarker for assessing organ dysfunction
in sepsis. Elevated lactate levels occur in hypoxic conditions due to increased
glycolysis and reduced tissue oxygenation. This elevation can also indicate
impaired lactate clearance by the liver and correlates with an increased risk of
mortality in patients with hospital-acquired sepsis. Serial monitoring of lactate
levels can aid in predicting mortality rates and assist in risk classification.
Many hospitals use a lactate threshold of greater than 2 mmol/L (or >18
mg/dL) in the absence of hypovolemia as a criterion to screen for septic shock
(6). In addition to lactate, emerging research is exploring other biomarkers for
sepsis diagnosis and prognosis. Some studies have found that miRNA and
plasma cell-free DNA can distinguish septic patients from healthy controls and
predict mortality in septic patients (26, 53, 54).
Identifying precise biomarkers for diagnosing sepsis that can be used in routine
clinical practice remains challenging. It is unlikely that a single biomarker will
be sufficient for assessing a patient’s immunological status comprehensively.
Therefore, given the limitations of individual markers, combining multiple
biomarkers may boost both sensitivity and specificity for detecting sepsis.
In recent years, large-scale molecular analysis techniques have enabled the
simultaneous study of multiple biomarkers. Advances in these techniques, such
as high-throughput proteomics and transcriptomics, facilitate the
comprehensive analysis of biological samples, specifically focusing on
proteins and genes. Machine learning methods are being increasingly utilized
to extract key predictive biomarkers from these complex datasets, which often
involve a limited number of samples but a vast array of molecules to analyze
(55, 56). By applying machine learning to proteomic and transcriptomic data,
researchers can uncover patterns and correlations that are not readily visible
through traditional methods. Integrating robust multi-biomarker profiles with
machine learning techniques can lead to more accurate differentiation of
disease stages and more precise identification of specific causes of illness in
patients. This approach promises to enhance diagnostic and prognostic
capabilities by providing deeper insights into the underlying biological
processes.

2 AIM

The overall aim of this project is to advance the understanding and diagnostic
capabilities related to sepsis by investigating the molecular characteristics of
pathogens and the host biological responses. Specifically, our aims are:

I. To benchmark a cloud-based diagnostic software for S. aureus Whole
Genome Sequencing (WGS) data against conventional methods
currently used in clinical laboratories

II. To investigate the antibiotic susceptibility pattern and epidemiology
of S. aureus infection in the Skaraborg region, a western region in
Sweden

III. To identify protein markers that can discriminate between bacterial
infections caused by gram-positive and gram-negative bacteria

IV. To determine transcriptomic markers that distinguish between E. coli
and S. aureus-induced sepsis
3 MATERIALS AND METHODS

3.1 SUBJECTS
 
Papers I, II, and III are part of a prospective observational study of community-
onset severe sepsis and septic shock in Swedish adults conducted from 
September 2011 to June 2012 at Skaraborg Hospital, a secondary hospital with 
640 beds, in the western region of Sweden (Skaraborg sepsis study). All 
patients ≥18 years admitted to the emergency department with suspected
community-onset sepsis who gave written consent were enrolled in
the study (57).
Signs and symptoms, together with clinical and laboratory data, were recorded
before antibiotic therapy was administered. Two senior infectious
disease specialists retrospectively reviewed all medical data to assess if the
patients met Sepsis-3 criteria. Bacterial infection was verified either by
identification of relevant bacteria in culture or by a typical clinical
presentation, such as erysipelas.
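As an illustration of the Sepsis-3 rule that the reviewers assessed (an increase in SOFA score of 2 or more points together with a suspected or documented infection), the following Python sketch encodes the criterion. It is not the procedure used in the study, where classification rested on specialist review of the medical records; the function name, the argument names, and the assumption that a baseline SOFA score is known are hypothetical.

```python
def meets_sepsis3(baseline_sofa: int, current_sofa: int, infection: bool) -> bool:
    """Return True when the Sepsis-3 rule is satisfied.

    Illustrative assumptions: SOFA totals lie between 0 and 24, the baseline
    score is known, and `infection` means suspected or documented infection.
    """
    for score in (baseline_sofa, current_sofa):
        if not 0 <= score <= 24:
            raise ValueError("SOFA total must lie between 0 and 24")
    # Organ dysfunction: acute increase in SOFA of 2 points or more.
    organ_dysfunction = (current_sofa - baseline_sofa) >= 2
    # Sepsis: organ dysfunction combined with infection.
    return organ_dysfunction and infection


# Hypothetical patient: baseline SOFA 0, SOFA 3 at presentation, infection suspected.
print(meets_sepsis3(0, 3, True))
```

In practice the baseline score is often unknown and is then commonly assumed to be zero in patients without pre-existing organ dysfunction; the sketch leaves that decision to the caller.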
Bacterial isolates obtained from various patient culture samples were utilized
for Papers I and II. A total of 272 S. aureus isolates from 212 patients had
been identified by culturing and the MALDI-TOF MS (DB-4110) technique and
cryopreserved at -80°C; they were thawed for this work. Five isolates could not
be recovered after freezing. In Paper I, 267 isolates were processed for DNA
extraction and whole genome sequencing. After quality control of the raw data,
the output files for three isolates were removed from the dataset. The study is
based on the output files for the remaining 264 S. aureus isolates with
high-quality sequence data.
Genetic analysis for classification of the S. aureus isolates revealed that two of 
them had been misclassified and were not S. aureus. Therefore, in Paper II, we 
adjusted the data accordingly and examined laboratory records related to 262 
S. aureus strains isolated from 212 patients aged 18 to 97 years. 
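The isolate counts reported above can be checked with simple bookkeeping. The short Python sketch below is only an illustration: the function name and the step labels are hypothetical, while the numbers are those given in the text.

```python
def track_exclusions(initial, exclusions):
    """Return the running count after each exclusion step.

    `exclusions` is a list of (reason, number_removed) pairs.
    """
    remaining = initial
    flow = []
    for reason, removed in exclusions:
        if removed > remaining:
            raise ValueError(f"cannot remove {removed} isolates at step: {reason}")
        remaining -= removed
        flow.append((reason, remaining))
    return flow


isolate_flow = track_exclusions(
    272,
    [
        ("irretrievable after freezing", 5),         # 267 processed for sequencing
        ("failed quality control of raw data", 3),   # 264 analysed in Paper I
        ("not S. aureus by genetic analysis", 2),    # 262 analysed in Paper II
    ],
)
for reason, remaining in isolate_flow:
    print(f"{remaining:>3} isolates remain after exclusion: {reason}")
```

The three running totals (267, 264, and 262) match the numbers of isolates stated for sequencing, Paper I, and Paper II, respectively.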
In Paper III, blood samples were obtained from 291 patients with confirmed 
bacterial infections, including 184 with gram-negative strains and 107 with 
gram-positive strains, as well as from 40 healthy controls. The blood was 
drawn into sodium citrate tubes, centrifuged, and the plasma was stored at -80 
°C until analysis using proximity extension assay technology. Fifty samples 
failed quality control after protein quantification, resulting in a final count of 
246 patient samples and 35 samples from healthy controls. Of the 246 patient 
samples, 154 contained gram-negative bacteria and 92 contained gram-positive 
bacteria. The proximity extension assay technology quantified a total of 368
proteins, including nine duplicate proteins and one triplicate protein due to
overlapping panels.
In Paper IV, microarray data with accession number GSE33341 was retrieved
from the GEO database (www.ncbi.nlm.nih.gov/geo/) using specific keywords
like ‘sepsis’, ‘S. aureus’, ‘E. coli’ and ‘array’ for further analysis and
investigation. This data was part of a prospective observational study
sponsored by the NIH, conducted between December 2005 and July 2010. The
NIH study aimed to advance the development of innovative diagnostic tests for
severe sepsis and community-acquired pneumonia, involving four medical
centers in the USA (ClinicalTrials.gov NCT00258869) (58).
The study included adult patients admitted to the emergency department with
sepsis. The subjects in this report had confirmed cases of monomicrobial
bloodstream infection caused by either E. coli (n=19, age range 25–91) or S.
aureus (n=32, age range 24–91). Additionally, there were 43 uninfected
controls (age range 21–59). Whole blood samples were collected either on the
day of the hospital presentation or within 24 hours before the initiation of
treatment. The microarray technique was employed for gene analysis in the
study, revealing a total of 22,277 genes.

3.2 METHODS

Reference microbial species identification (Paper I)
Microbiological culturing and species determination for the paper were
performed via MALDI-TOF MS, as previously described by Ljungström et al.
(59), using a Microflex LT mass spectrometer (Bruker Daltonics, Leipzig,
Germany) and BioTyper software v2.0 with default parameter settings. The
cut-off value for species identification was set above a score of 2.0. The study
utilized the Bruker microorganism database MBT Compass Library DB-4110
(Bruker Daltonics, Germany), released in April 2011.

Reference antibiotic susceptibility (Paper I, Paper II)
Antibiotic susceptibility testing (AST) was performed by disk diffusion on
Mueller Hinton media at the clinical laboratory Unilabs at Skaraborg Hospital,
Skövde, Sweden, according to European Committee on Antimicrobial
Susceptibility Testing (EUCAST) guidelines (www.eucast.org). AST findings
obtained from identified S. aureus are referred to as phenotypic AST in Paper
I. AST results indicating isoxazolyl penicillin and cefoxitin resistance were
followed by mecA detection by PCR to confirm the isolate as a
methicillin-resistant S. aureus (MRSA). The phenotypic AST results described in
Paper I are restricted to the antibiotics available on the 1928 platform. In
Paper II, AST was limited to ciprofloxacin, clindamycin, erythromycin,
isoxazolyl penicillin, penicillin G, penicillin V, piperacillin, fusidic acid,
and vancomycin.

Whole genome sequencing (Paper I)
Strains of S. aureus were cultured on typical blood agar, and DNA was
extracted on the automated MagNA Pure 96 instrument using the DNA and Viral NA
Small Volume kit with the Pathogen Universal 200 protocol (Roche
Diagnostics, Switzerland) at Unilabs, Skövde. The DNA concentration was
determined using the Qubit dsDNA HS assay kit on Qubit 3.0 (Thermo Fisher
Scientific, USA) and a NanoDrop spectrophotometer (Thermo Fisher
Scientific, USA). Library preparation was performed according to Illumina’s
guideline for Nextera XT, and the DNA was sequenced using the Illumina HiSeq
2500 platform by the high-throughput protocol for bacterial genomes at
SciLifeLab, Solna, Sweden.

Bioinformatics analysis (Paper I)
A manual in-house pipeline (INH) was set up for Illumina paired-end (PE) read
libraries consisting of various steps for quality control, trimming, assembly,
and annotation of the contigs in FASTA format. The PE sequenced reads were
assessed for initial quality control using FastQC (v.0.11.8) (60). Trimmomatic
(v.0.36) (61) was employed to remove adapter sequences and filter out
low-quality reads. The tool used a sliding window approach with a window size of
four and a quality threshold of Q20. The first 12 bases were trimmed using the
HEADCROP argument to correct for nucleotide bias. Reads longer than 30
bases after these trimming steps were retained for subsequent analysis. De
novo genome assembly was done using SPAdes (v.3.13.1) (62). Assembly
quality control was performed by QUAST (v.5.0.2) (63). The criteria for a
good assembly included a large N50, a low number of contigs, sufficiently
many contigs longer than 10,000 bp, and a genome size close to 2.8 Mbp,
which is the size of the S. aureus reference genome (GenBank accession number
NC_007795.1). For assemblies that did not meet these criteria, the median
coverage was also calculated using R (v.3.5) (64). FASTA files with contigs
were annotated with tools available from the Center for Genomic Epidemiology
(http://www.genomicepidemiology.org/). 16S rRNA-based species
identification of S. aureus was performed using SpeciesFinder (65).
K-mer-based species identification was performed with KmerFinder 3.1 using the
bacterial organisms database with a k-mer size of 16 and the prefix "ATG"
(65-67). Isolates identified as non-S. aureus were further verified through
taxonomic analysis based on average nucleotide identity using the JSpeciesWS
tool (v.3.4.0) (68) and were excluded from subsequent analysis. Acquired
antimicrobial resistance genes in all sequenced S. aureus isolates were
identified using ResFinder 3.1 (69). The analysis used the default settings,
which included all antimicrobial databases, with thresholds of 90% for identity
and 60% for minimum length. MLST 2.0 (70) was employed for multilocus sequence
typing of the S. aureus genomes. The S. aureus sequences were aligned against
the seven MLST loci arcC, aroE, glpF, gmk, pta, tpi, and yqiL. Virulence
factors were predicted using VirulenceFinder 2.0 (71), with S. aureus selected
as the species and the default settings of a 90% identity threshold and a
minimum length of 60% (Figure 5).
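
The assembly criteria named in the pipeline description are stated
qualitatively. As a minimal sketch of how they could be computed from a list
of contig lengths, the Python functions below derive N50, the number of
contigs longer than 10,000 bp, and the relative deviation from the 2.8 Mbp
reference size. The function names, the dictionary keys, and the toy contig
lengths are illustrative assumptions; in the study, assembly metrics were
produced by QUAST.

```python
def n50(contig_lengths):
    """Return the N50: the contig length L such that contigs of length >= L
    together cover at least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def assembly_summary(contig_lengths, expected_size=2_800_000):
    """Summarize the assembly metrics named in the text for one assembly."""
    total = sum(contig_lengths)
    return {
        "n_contigs": len(contig_lengths),
        "n_contigs_over_10kb": sum(1 for c in contig_lengths if c > 10_000),
        "total_length": total,
        "n50": n50(contig_lengths),
        # Relative deviation from the ~2.8 Mbp S. aureus reference size.
        "size_deviation": abs(total - expected_size) / expected_size,
    }


# Toy assembly with illustrative contig lengths (not thesis data).
toy_contigs = [1_200_000, 800_000, 500_000, 200_000, 60_000, 30_000, 8_000, 2_000]
summary = assembly_summary(toy_contigs)
print(summary)
```

In this toy case the two largest contigs already cover half of the assembly,
so N50 equals the length of the second contig. What counts as "large" or "low"
is left to the analyst, as in the text.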
The S. aureus WGS data were also analyzed on a commercial cloud-based
platform, 1928 (1928 Diagnostics, Gothenburg, Sweden). Raw PE fastq.gz
files were uploaded and processed through the S. aureus pipeline. The platform
required a sequencing depth greater than 30x as an initial quality control
measure. The pipeline performed species identification, predicted antibiotic
susceptibility, and analyzed virulence genes using a k-mer-based assembly
method. According to a discussion with the platform maintainer (1928
Diagnostics, Sweden), the S. aureus pipeline was not upgraded during the
access period of June and July 2019.

Figure 5. Workflow of the in-house pipeline. The diagram shows the manual
workflow of A. sequencing PE FASTQ files, B. quality control and trimming,
C. assembly and evaluation, and D. annotation of the assembled contigs in
FASTA format using the Multi Locus Sequence Typing tool (MLST), the virulence
gene identification tool (VirulenceFinder), the resistance gene detection tool
(ResFinder), the species prediction tool based on a k-mer algorithm
(KmerFinder), the species prediction tool based on the 16S ribosomal DNA
sequence (SpeciesFinder), and the JSpecies Web Server for pairwise genome
comparison of prokaryotic species.

Comparative evaluation of the bioinformatic workflows (Paper I)
Species identification results from MALDI-TOF MS and phenotypic AST results
were compared with the genotypic results obtained from the INH and 1928
workflows. The level of agreement between these results was evaluated. For
the AST comparison, very major errors (VMEs) and major errors (MEs) were
assessed: a VME was defined as a phenotype showing resistance with a
genotypic prediction of susceptibility (false negative), while an ME was
defined as a phenotype showing susceptibility with a genotypic prediction of
resistance (false positive) (72). The virulence genes and sequence types
obtained from the two bioinformatics workflows were compared with each
other.
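
The VME and ME definitions used for the AST comparison can be made concrete
with a small sketch. The Python code below labels each phenotype/genotype pair
and tallies agreement, VMEs, and MEs. The "R"/"S" coding, the function names,
and the five example pairs are assumptions made for illustration and are not
taken from the thesis data.

```python
def classify_ast(phenotype, genotype):
    """Compare one phenotypic AST result with its genotypic prediction.

    Both arguments are "R" (resistant) or "S" (susceptible).
    """
    if phenotype == genotype:
        return "agreement"
    if phenotype == "R" and genotype == "S":
        return "VME"  # resistance missed by the genotype (false negative)
    return "ME"  # resistance predicted but not observed (false positive)


def summarize(pairs):
    """Count agreements, VMEs, and MEs over (phenotype, genotype) pairs."""
    counts = {"agreement": 0, "VME": 0, "ME": 0}
    for phenotype, genotype in pairs:
        counts[classify_ast(phenotype, genotype)] += 1
    counts["percent_agreement"] = 100.0 * counts["agreement"] / len(pairs)
    return counts


# Illustrative pairs only: (phenotypic AST, genotypic prediction).
example_pairs = [("R", "R"), ("S", "S"), ("R", "S"), ("S", "R"), ("S", "S")]
result = summarize(example_pairs)
print(result)
```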

Proteomic quantification (Paper III)
Protein biomarkers were quantified using Proximity Extension Assay (PEA)
technology with four Olink panels at TATAA Biocenter, Gothenburg, Sweden.
In PEA, pairs of antibodies carrying complementary DNA tags bind to the same
target protein; when the two antibodies are in close proximity, their DNA tags
bind to each other and are amplified. The readout consists of quantification
cycle values obtained by real-time PCR, which are then converted to normalized
protein expression (NPX) values on a log2 scale (73) (Figure 6). Four Olink panels
(Cardiometabolic (CM), v.3603; Cardiovascular II (CVD II), v.5004;
Immune Response (IR), v.3202; and Inflammation (Inf), v.3021; Olink
Biosciences, Uppsala, Sweden) were used for quantifying proteins in
patients with sepsis and healthy control individuals.

Figure 6. Simplified overview of Proximity Extension Assay (PEA) technology.
Antigen-specific antibodies with individual DNA tags bind to a target protein.
The DNA tags of the antibodies for the same protein are complementary to each
other and serve as a template in real-time PCR detection.

Missing value handling using GSimp (Paper III)
The limit of detection (LOD) for each PEA assay is established at three
standard deviations above the background signal. We removed protein values
below LOD based on the manufacturer's instruction and defined the missing
values as Missing Not At Random (MNAR). A set of MNAR percentage thresholds
was chosen to create ten datasets with <5% to <80% of expression values
below LOD in each group. GSimp (74), implemented in R (v.4.1.1) (64) and
accessible from GitHub (https://github.com/WandeRum/GSimp), was used
with standard settings to impute missing values by group. After
imputation, proteins measured in duplicate or triplicate were replaced by the
mean of their expression values.
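
One way to read the construction of the missing-value datasets is sketched
below in Python: values below LOD are masked, and a protein is kept in a
dataset only if its below-LOD fraction stays under the chosen threshold in
every group. This reading, the data layout, and all names and numbers in the
code are assumptions made for illustration. The imputation itself was done
with GSimp in R and is not reproduced here.

```python
def fraction_below_lod(values, lod):
    """Fraction of measurements that fall below the limit of detection."""
    return sum(1 for v in values if v < lod) / len(values)


def select_proteins(npx, lod, groups, max_missing):
    """Keep proteins whose below-LOD fraction is under max_missing in every group.

    npx:    {protein: {sample: NPX value}}
    lod:    {protein: LOD in NPX units}
    groups: {group name: [sample, ...]}
    """
    kept = []
    for protein, by_sample in npx.items():
        fractions = [
            fraction_below_lod([by_sample[s] for s in samples], lod[protein])
            for samples in groups.values()
        ]
        if all(f < max_missing for f in fractions):
            kept.append(protein)
    return kept


def mask_below_lod(npx, lod):
    """Replace below-LOD values with None so they can be imputed later."""
    return {
        p: {s: (v if v >= lod[p] else None) for s, v in by_sample.items()}
        for p, by_sample in npx.items()
    }


# Toy data (hypothetical proteins P1-P3, four samples per group).
groups = {"sepsis": ["s1", "s2", "s3", "s4"], "control": ["c1", "c2", "c3", "c4"]}
lod = {"P1": 1.0, "P2": 1.0, "P3": 1.0}
npx = {
    "P1": {"s1": 3.0, "s2": 3.2, "s3": 2.9, "s4": 3.1,
           "c1": 2.0, "c2": 2.2, "c3": 2.1, "c4": 1.9},
    "P2": {"s1": 2.0, "s2": 0.5, "s3": 2.1, "s4": 1.9,
           "c1": 1.5, "c2": 1.6, "c3": 1.4, "c4": 1.7},
    "P3": {"s1": 2.5, "s2": 2.4, "s3": 2.6, "s4": 2.2,
           "c1": 0.2, "c2": 0.4, "c3": 0.3, "c4": 1.8},
}

# One protein list per missingness threshold (three of the ten shown).
datasets = {t: select_proteins(npx, lod, groups, t) for t in (0.05, 0.30, 0.80)}
masked = mask_below_lod(npx, lod)
print(datasets)
```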

Comparative feature selection (Paper III)
The ten datasets, each containing a different degree of imputed missing data
(5%, 10%, 20%, 25%, 30%, 40%, 50%, 60%, 70%, and 80%), were subjected
to feature selection methods, and the classifier's prediction accuracy was
measured. To assess the algorithms' predictive performance, each dataset was
split randomly into training and test sets containing 80% and 20% of the
samples, respectively.
Three widely used supervised algorithms were evaluated to identify the most
effective method for identifying predictive proteins in our dataset.
1) Random forest (RF) (75) generates multiple classification trees from
   data samples and selects the class with the highest number of votes as
   the output for a classification problem. We created RF models with
   100 decision trees, each built on bootstrap samples from the training
   set, using Gini impurity as the split-quality measure.
   To address the imbalance in sample numbers across groups, we
   implemented cost-sensitive learning. The number of features
   randomly selected at each split was explored separately in
   each dataset, ranging from 1 to 20. We performed ten-fold
   cross-validation with three repetitions on the entire dataset to
   determine the optimal number of features, calculating the accuracy score
   and standard deviation (std). The feature number with the highest
   accuracy and lowest std was selected as optimal. The RF
   model was then used to predict the test set, and we evaluated its
   performance by generating a classification report that included
   precision, recall (sensitivity), f1-score, and accuracy.
2) The Least Absolute Shrinkage and Selection Operator (LASSO) is a
   regularization technique that shrinks the coefficients of redundant
   or irrelevant predictors to zero (76). We generated the
   LASSO model on the training set using a tuned lambda value of 0.02.
   This lambda value was determined automatically by fitting a
   LassoCV model on the dataset, tuning the lambda hyperparameter
   within the range of 0 to 1 with a step size of 0.01. Ten-fold
   cross-validation was employed on the entire dataset to identify the
   optimal lambda parameter, with the process repeated three times. The LASSO
   model was then used to predict the test set, and its predictive power
   was assessed by calculating the mean squared error (MSE).
3) Recursive Feature Elimination (RFE) is a wrapper algorithm that
   uses another machine learning algorithm at its core (77), enabling
   the exploration of algorithms suited to the data's structure. Initially,
   five algorithms were examined as the core of the RFE algorithm: logistic
   regression (LR), perceptron, gradient boosting, decision tree, and
   support vector machine. After ten-fold cross-validation on the entire
   dataset, repeated three times, the accuracy of
   each core algorithm was determined. Based on the results, logistic
   regression (RFE-LR) was selected, as it provided the highest accuracy
   among the core algorithms. The RFE-LR model was then generated
   with default parameters on the training set of each dataset. The model's
   predictive performance was assessed by predicting the test set and
   evaluating the classification report, including precision, recall,
   f1-score, and accuracy. The workflow of the machine learning approach
   is illustrated in Figure 1 of Paper III. We used accuracy as the parameter
   to assess the efficacy of each algorithm in classifying patients with
   gram-positive and gram-negative bacterial infections. The selected
   method, Lasso, was applied to our top-ranked dataset for the selection
   of predictive proteins.
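
To illustrate why Lasso can act as a feature selector, the sketch below
implements the Lasso objective with plain cyclic coordinate descent in Python.
It is not the Scikit-Learn LassoCV model used in the study. The function
names, the toy design matrix, and the response values are assumptions; only
the lambda value of 0.02 is borrowed from the text. The soft-thresholding step
is what sets small coefficients exactly to zero.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimize (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 without an intercept.

    X is a list of rows, y a list of responses (both assumed centered).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # squared norm of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            w[j] = soft_threshold(rho / n, lam) / (z / n)
    return w


# Toy data: three orthogonal features; only the first carries a strong signal.
X = [[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]]
y = [2.01, 1.99, -1.99, -2.01]  # equals 2 * feature0 + 0.01 * feature1

weights = lasso_coordinate_descent(X, y, lam=0.02)
selected = [j for j, coef in enumerate(weights) if coef != 0.0]
print(weights, selected)
```

With this orthogonal toy design, the weak second feature (least-squares
coefficient 0.01) falls below the penalty of 0.02 and is removed, while the
first feature is kept with a slightly shrunken coefficient.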

Protein-protein network interaction (Paper III, IV)
For the predictive proteins selected by the Lasso algorithm, a
protein-protein interaction (PPI) network was constructed based on the STRING
database (78). The interactions between the predictive proteins were then
examined using the Cytoscape software (79). To find clusters in the network,
the Molecular Complex Detection (MCODE) plug-in (80) of Cytoscape was used,
which clusters proteins into highly interconnected regions.

GO term and pathway enrichment analyses (Paper III, IV)
To explore the biological significance of the predictive proteins (Paper III)
and genes (Paper IV), a list of corresponding genes for the predictive
proteins was first obtained via the GeneCards resource (www.genecards.org)
(81, 82). Functional and pathway enrichment analyses were then performed using
the PANTHER (Protein ANalysis THrough Evolutionary Relationships) (83) and
Reactome (84) resources in the Gene Ontology (GO) (85-87) interface. An
FDR-adjusted p-value <0.05 was considered a significant enrichment.

Microarray preprocessing (Paper IV)
Affymetrix microarrays were normalized using the Robust Multichip Average
(RMA) technique (88). All transcripts detected in at least one sample were
included, without any prior screening for differential expression.
Subsequently, the probe IDs were converted to unique official gene symbols,
and the dataset was rearranged with the 94 samples in rows and the 22,277
genes in columns.

Feature selection strategy (Paper IV)
The microarray dataset was divided into two subsets, one containing 80%
of the samples for training and the other 20% for testing. The Lasso feature
selection method was applied with the configuration described previously,
using a tuned lambda value of 0.07. The predictive power of the model was
evaluated using MSE.

Sample balance (Paper IV)
To address the imbalance in sample sizes based on age and gender, we
stratified the data into subgroups and performed upsampling. We first
equalized the sample sizes in each age subgroup and then balanced the genders
within each bacterial group. Finally, we matched the number of samples
between the bacterial and healthy control groups. This resulted in a total of 228
samples and 22,277 genes, allowing us to train and test the predictive genes
with equal sample sizes for each group.
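
The upsampling step under Sample balance can be sketched as random
duplication within subgroups. The text does not specify how upsampling was
implemented, so the Python function below is only one plausible reading:
sampling with replacement until every subgroup reaches the size of the largest
one. The function name, the record layout, and the toy samples are assumptions.

```python
import random
from collections import defaultdict


def upsample(samples, key, rng):
    """Randomly duplicate samples so that every subgroup defined by `key`
    reaches the size of the largest subgroup."""
    by_group = defaultdict(list)
    for sample in samples:
        by_group[key(sample)].append(sample)
    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)
        # Draw the missing samples with replacement from the same subgroup.
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


# Toy cohort (hypothetical): five female and two male samples.
cohort = [{"id": i, "sex": "F"} for i in range(5)]
cohort += [{"id": i, "sex": "M"} for i in range(5, 7)]

balanced = upsample(cohort, key=lambda s: s["sex"], rng=random.Random(0))
counts = {sex: sum(1 for s in balanced if s["sex"] == sex) for sex in ("F", "M")}
print(counts)
```

Calling the same function successively with different keys (age subgroup,
then gender, then infection group) would mirror the three balancing steps
described in the text.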

External validation (Paper IV)
To ensure the generalizability of our results, we validated our predictive gene
set using two additional datasets: GSE13015 (89) and GSE65088 (90).
GSE13015 included samples of E. coli and S. aureus from sepsis patients and
healthy controls, while GSE65088 consisted exclusively of samples of E. coli
and S. aureus cultured in healthy donors' blood. Both datasets were processed
consistently. We matched our predictive genes with the datasets. Subsequently,
we used GSE13015 to evaluate gene performance in distinguishing E. coli, S.
aureus, and healthy controls, and GSE65088 to differentiate E. coli from S.
aureus samples.

3.3 STATISTICAL ANALYSIS

Statistical analysis was performed using R (v.3.5) (64) and Jupyter Notebook
(v.6.0.3) (91) within the Anaconda (v.2-2.4.0) (92) environment, utilizing the
Scikit-Learn library (93). In Paper I, the percentage of agreement between
conventional microbiological routines and bioinformatics workflows was
calculated. This measure was also applied to assess the agreement between the
two bioinformatic methods. The Agresti-Coull approach was employed to
compute 95% confidence intervals for these agreement percentages.
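
As a worked illustration of the Agresti-Coull interval, the Python sketch
below applies the standard formula to an agreement count. The function name is
an assumption. The counts 202 of 264 are taken from one of the agreement
figures reported for Paper I in the Results and serve only as an example; the
code is not the original analysis script.

```python
from math import sqrt
from statistics import NormalDist


def agresti_coull(successes, n, confidence=0.95):
    """Agresti-Coull confidence interval for a binomial proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2                       # adjusted sample size
    p_adj = (successes + z ** 2 / 2) / n_adj  # adjusted proportion
    margin = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


low, high = agresti_coull(202, 264)
print(f"{100 * 202 / 264:.1f}% agreement, "
      f"95% CI {100 * low:.1f}% to {100 * high:.1f}%")
```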

In Paper II, descriptive statistics were reported as mean, standard deviation,
range, frequencies, and percentages. An unpaired t-test was used to compare
the mean age between male and female groups, with a significance level set
at p < 0.05. Statistical analysis was conducted using IBM SPSS version 25
(IBM Corp, USA).

In Paper III, bioinformatics analysis was conducted in three phases. Initially, a 
t-test was used to compare patient groups and calculate p-values, which were  
then adjusted for false discovery rate using the Benjamini and Hochberg 
method with a significance threshold of p-adj < 0.05. Next, an unsupervised  
learning approach, involving Principal Component Analysis (PCA) and t-
distributed Stochastic Neighbor Embedding (t-SNE), was employed to explore  
the grouping patterns without prior knowledge of the sample groups. Finally, 
the Lasso method was applied to select predictive proteins. The performance  
of these proteins was assessed by calculating the area under the receiver 
operating characteristic curve (AUC-ROC), and the linear relationship among  
predictive proteins was evaluated using the Pearson correlation coefficient 
(PCC).  
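
The Benjamini and Hochberg adjustment named for Paper III can be sketched in
a few lines of Python. The function name and the five example p-values are
assumptions chosen for illustration and are not results from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw_p = [0.001, 0.008, 0.039, 0.041, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
print(adj_p, significant)
```

In this example, the third and fourth raw p-values are below 0.05 but their
adjusted values are not, so only the first two tests remain significant at
p-adj < 0.05.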
In Paper IV, a two-step analysis was used to identify genes distinguishing  
between E. coli and S. aureus-induced sepsis. Initially, PCA was employed to 
visualize sample relationships and gain insights into the data structure.  
Subsequently, Lasso regression was applied for feature selection to identify the 
most predictive genes. The performance of these selected genes was assessed  
using logistic regression (94) and metrics including AUC, precision, recall, F1-
score, and accuracy. The Mann-Whitney U test was utilized to evaluate  
differences in gene expression between E. coli and S. aureus-induced sepsis  
where applicable. The linear relationship between predictive genes was 
assessed using PCC, and the predictive performance of these genes on external  
datasets was evaluated using the AUC-ROC curve. 
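
The AUC-ROC used above has a direct link to the Mann-Whitney U statistic: it
equals the proportion of (positive, negative) sample pairs in which the
positive sample receives the higher score, with ties counted as one half. The
Python sketch below computes it that way. The function name and the scores
are illustrative assumptions, not study data.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    The numerator is the Mann-Whitney U statistic of the positive group.
    """
    u = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                u += 1.0
            elif p == n:
                u += 0.5  # ties count as half a win
    return u / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for two groups of four samples each.
pos = [0.9, 0.8, 0.6, 0.55]
neg = [0.7, 0.5, 0.4, 0.3]
auc = auc_from_scores(pos, neg)
print(auc)
```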
3.3 STATISTICAL ANALYSIS 3.4 ETHICAL CONSIDERATION 
  
Statistical analysis was performed using R (v.3.5) (64), and Jupyter Notebook For Papers I-III, the study was approved by the Regional Ethical Review Board 
(v.6.0.3) (91) within the Anaconda (v.2-2.4.0) (92) environment, utilizing the of Gothenburg (376-11). Adult patients suspected of sepsis who were admitted 
Scikit-Learn library (93). In paper I, the percentage of agreement between to the emergency department between 2011 and 2012 provided written 
conventional microbiological routines and bioinformatics workflows was informed consent. They were fully informed about the study’s purpose, their 
calculated. This measure was also applied to assess the agreement between the obligations, their right to withdraw from the study at any time, and the study’s 
two bioinformatic methods. The Agresti-Coull approach was employed to adherence to ethical standards. Plasma and whole blood samples were 
compute 95% confidence intervals for these agreement percentages.  collected and stored in a biobank. Patients had the opportunity to contact Dr. 
Lars Ljungström for clarification of any information and were informed they 
In Paper II, descriptive statistics were calculated by mean, standard deviation, could access the study results through published articles. In Paper I, which 
range, frequencies and percentages. An unpaired t-test was used to compare focused on bacterial strains, patient consent was deemed unnecessary 
the average age between male and female groups, with a significance level set according to national regulations (2003:460). In Paper II, reviewing the 
at p < 0.05. Statistical analysis was conducted using IBM SPSS version 25 patients’ electronic health records did not require individual consent at the 
(IBM Corp, USA). time. For Paper IV, an online dataset with prior ethical approval was utilized, 
eliminating the need for additional ethical clearance. 
In Paper III, bioinformatics analysis was conducted in three phases. Initially, a 
t-test was used to compare patient groups and calculate p-values, which were  
then adjusted for false discovery rate using the Benjamini and Hochberg 
method with a significance threshold of p-adj < 0.05. Next, an unsupervised  
learning approach, involving Principal Component Analysis (PCA) and t-
distributed Stochastic Neighbor Embedding (t-SNE), was employed to explore  
the grouping patterns without prior knowledge of the sample groups. Finally, 
the Lasso method was applied to select predictive proteins. The performance  
of these proteins was assessed by calculating the area under the receiver 
operating characteristic curve (AUC-ROC), and the linear relationship among  
predictive proteins was evaluated using the Pearson correlation coefficient 
(PCC).  
In Paper IV, a two-step analysis was used to identify genes distinguishing  
between E. coli and S. aureus-induced sepsis. Initially, PCA was employed to 
visualize sample relationships and gain insights into the data structure.  
Subsequently, Lasso regression was applied for feature selection to identify the 
most predictive genes. The performance of these selected genes was assessed  
using logistic regression (94) and metrics including AUC, precision, recall, F1-
score, and accuracy. The Mann-Whitney U test was utilized to evaluate  
differences in gene expression between E. coli and S. aureus-induced sepsis  
where applicable. The linear relationship between predictive genes was 
assessed using PCC, and the predictive performance of these genes on external  
datasets was evaluated using the AUC-ROC curve. 
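The performance metrics listed above can be written out explicitly. The sketch below is illustrative only: the labels (1 for one pathogen group, 0 for the other), the scores, and the 0.5 decision threshold are hypothetical assumptions, not values from Paper IV. The AUC is computed through its rank interpretation, which is the same pairwise comparison that underlies the Mann-Whitney U statistic.

```python
def classification_metrics(y_true, y_pred):
    """Precision, recall, F1-score and accuracy for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def auc_roc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities (illustrative only).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = auc_roc(y_true, scores)
print(metrics, auc)
```

Precision, recall, F1-score, and accuracy depend on the chosen threshold, whereas the AUC summarises ranking quality across all thresholds.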
  
4 RESULTS AND DISCUSSION

4.1 PAPER I- BENCHMARKING OF TWO BIOINFORMATICS WORKFLOWS
 
NGS technologies, particularly bacterial WGS, offer the potential to transform 
clinical microbiology by providing rapid and precise identification of bacterial 
sepsis, antibiotic resistance genes, and virulence genes (95). However, the 
clinical adoption of WGS has been slow due to the lack of appropriate 
platforms (96). Recently, 1928 Diagnostics (Gothenburg, Sweden), an automated
bioinformatics platform designed for WGS analysis, has been employed in
studies of Staphylococcus argenteus (S. argenteus), MRSA, and Klebsiella
spp. (97-100).
In this study, our objective was to assess the performance of the 1928
Diagnostics platform in identifying S. aureus isolates from sepsis patients,
detecting resistance genes and virulence factors, and performing multilocus
sequence typing (MLST). We developed an INH pipeline (discussed in chapter 3.2)
and compared the results of the 1928 platform and the INH pipeline with those
from MALDI-TOF MS and phenotypic AST, as well as with each other.
 
During the development of our INH pipeline, we assessed several tools for
S. aureus species identification, including CGE SpeciesFinder (65), CGE
KmerFinder (66), and JSpeciesWS (68). CGE SpeciesFinder has previously
been reported to have low accuracy for species identification (65). In our study,
it also showed limitations in identifying S. aureus: it failed to identify
61 isolates and showed lower agreement with MALDI-TOF MS (76.5%, 202/264)
than CGE KmerFinder and JSpeciesWS, which both demonstrated a high level of
agreement (99.2%, 262/264) with the same method (Table 3).
Using 1928 for species identification, nine FASTQ files initially failed quality
control because the platform required a sequencing depth/coverage above 30X.
After the parameters were adjusted to accept a range of 11-29X, the 1928
platform showed high agreement (99.2%, 262/264) with MALDI-TOF MS (95% CI:
97.1-99.9) (Table 3). This indicates the platform’s effectiveness in identifying
S. aureus even with reduced depth/coverage. Post-adjustment, it identified two
discrepant isolates: SA 310 as Staphylococcus epidermidis (S. epidermidis)
and SA 1413 as a non-staphylococcal species. CGE SpeciesFinder, CGE
KmerFinder, and JSpeciesWS likewise identified SA 310 as S. epidermidis,
while CGE KmerFinder and JSpeciesWS identified SA 1413 as Staphylococcus
argenteus (S. argenteus).
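Agreement figures such as 262/264 are reported with a 95% confidence interval. The thesis does not state which interval method was used, so the sketch below assumes the Wilson score interval purely for illustration; other methods (for example an exact binomial interval) would give slightly different bounds, and the output may therefore differ from the intervals reported in the text. The function name is hypothetical.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 for roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p_hat = successes / total
    denom = 1 + z ** 2 / total
    centre = (p_hat + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return centre - half_width, centre + half_width


# Agreement counts quoted in the text: 262 of 264 isolates.
low, high = wilson_interval(262, 264)
print(f"{262 / 264:.1%} agreement, Wilson 95% CI {low:.1%} to {high:.1%}")
```

The Wilson interval is chosen here because it behaves well for proportions close to 100%, where the simple normal approximation can produce bounds above 1.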
  
S. argenteus, a novel species identified in 2015 using the Illumina HiSeq
platform (101), presents challenges in evaluating its clinical significance due
to limited research. However, some studies have noted that its frequency,
morbidity, and mortality rates are similar to those of S. aureus (102, 103), with
reports of methicillin-resistant isolates (101). The first S. argenteus case in
Sweden was documented in 2018 (97), just after we completed our WGS data
analysis. Since April 2018, both the regional laboratory’s database and the
1928 platform have been revised accordingly.

Table 3. Genetic predicted species identification by the bioinformatic
workflows of the 264 isolates identified as S. aureus by MALDI-TOF MS.

Bioinformatic tool   S. aureus     Other species   No prediction    Agreement against JSpeciesWS
                     (n [%])       (n [%])         (n [%])          % (95% CI)
JSpeciesWS           262 [99.2]    2 [0.8] *       0 [0]
SpeciesFinder        202 [76.5]    1 [0.4] **      61 [23.1] ***    76.5 (71.0-81.2)
KmerFinder           262 [99.2]    2 [0.8] *       0 [0]            100.0 (98.2-100.0)
1928                 262 [99.2]    2 [0.8] ****    0 [0]            99.6 (97.6-100.0)

*S. epidermidis and S. argenteus **S. epidermidis ***Results lower than 98% ID
match were excluded ****S. epidermidis and unknown.

We excluded the two samples identified as S. epidermidis and S. argenteus
from further analysis, since the study focused on S. aureus. Similarly, FASTQ
files that failed the 1928 platform’s quality control were excluded to ensure the
utilization of high-quality data and enhance results accuracy and reliability.
Consequently, 255 S. aureus isolates were included in the further
benchmarking of AST, virulence gene, and sequence type characterization.

Bioinformatics methods for WGS analysis have been demonstrated to be as
sensitive and specific as routine antimicrobial susceptibility testing methods
(72, 104, 105). We also observed an agreement of 98.0% (989/1006, 95% CI:
97.3-99.0) between the combined predicted genotypic antibiotic susceptibility
from both bioinformatic workflows (1928 and INH) and phenotypic AST
(Table 4). While both methods have their advantages and limitations, the high
degree of agreement between genotypic and phenotypic AST in this study
suggests that genotypic AST may be a useful tool for testing antibiotic
susceptibility in S. aureus, particularly in cases where phenotypic AST is not
feasible. The overall agreement of each workflow (1928 and INH) with the
reference method was also assessed. The 1928 platform demonstrated a high
overall agreement of 99.0% (996/1006, 95% CI: 98.1-99.5) (Table 4), while
the INH workflow showed a slightly lower but still strong overall agreement
of 98.4% (990/1006, 95% CI: 97.4-99.0) (Table 4). Across all genotypic AST
findings, the total agreement was 99.2% (998/1006, 95% CI: 98.4-99.6) (Table
4).

Table 4. Evaluation of predicted genotypic AST using the 1928 platform, the
in-house pipeline, and phenotypic AST for 255 S. aureus isolates.

Phenotypic AST (n)   Predicted genotypic AST by       Discordant across
                     1928 platform / INH              methods (n [%])
                     R/R    S/S    R/S    S/R
S (981)              1      979    1      0           2 [0.2]
R (25)               10     8      7      0           15 [60.0]
In total n* = 1006   11     987    8      0           n** = 17 [1.7]

*Total number of cases tested with the EUCAST test. **Number of discordant
results involving both bioinformatics workflows. S, Susceptible; R, Resistant.
Bolded items indicate 100% agreement.

The strong agreement between the 1928 platform, INH, and AST is largely
attributed to the alignment between susceptible cases in phenotypic AST and
the two bioinformatics workflows. There was a minimal discordance of just
0.2% (2/981) when comparing these workflows with the genotypically
predicted AST (Table 4). In contrast, the greatest discordance among the three
methods was found for antibiotic resistance, with a discordance rate of 60.0%
(15/25; 95% CI: 40.7–76.5) (Table 4). This observation was further supported
by the lower combined ME rate of 0.1% (1/1006) compared to the combined
VME rate of 0.8% (8/1006) (Paper I, Table 5). This finding is consistent with
the Mason et al. study (2018), which observed higher agreement in
susceptibility compared to resistance genes using the three bioinformatic
methods of Genefinder, Mykrobe, and Typewriter (105).

Mason et al. (105) also reported VME for ciprofloxacin and fusidic acid, where
phenotypic AST indicated resistance, but genotypic AST suggested
susceptibility. Similarly, our study identified 26 discrepancies between
phenotypic AST and the two bioinformatics methods, including VME for
fusidic acid and ciprofloxacin. The 1928 platform exhibited a VME rate of
1.4% (1/70) for ciprofloxacin, whereas the INH platform had a higher rate of
5.7% (4/70). For fusidic acid, the 1928 platform had a VME rate of 1.9%
(4/206), while the INH showed a rate of 3.4% (7/206) (Paper I, Table 5).
Consistent with our findings for the 1928 platform, Gordon et al. (2014) (72)
have reported VME rates of 1.4% for ciprofloxacin but a lower rate of 0.6%
for fusidic acid using bioinformatics workflows like BLASTn and tBLASTn.
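The error categories discussed above can be tallied with a short sketch. It follows the definition given in the text for VME (phenotypic resistance, genotypic susceptibility) and assumes the converse case for ME (phenotypic susceptibility, genotypic resistance), which is the usual convention but is not spelled out in this section. Rates are divided by the total number of tests, matching how figures such as 8/1006 are reported. The function names and the paired calls are hypothetical.

```python
from collections import Counter


def categorise(phenotype, genotype):
    """Classify one phenotype/genotype pair; 'S' = susceptible, 'R' = resistant."""
    if phenotype not in ("S", "R") or genotype not in ("S", "R"):
        raise ValueError("calls must be 'S' or 'R'")
    if phenotype == genotype:
        return "agreement"
    if phenotype == "R":
        return "VME"  # very major error: resistance missed by the genotype
    return "ME"  # major error (assumed definition): resistance predicted for a susceptible isolate


def error_rates(pairs):
    """Return the share of agreement, ME and VME over all tested pairs."""
    counts = Counter(categorise(ph, ge) for ph, ge in pairs)
    total = len(pairs)
    return {key: counts[key] / total for key in ("agreement", "ME", "VME")}


# Hypothetical (phenotypic AST, genotypic prediction) calls, illustrative only.
pairs = [("S", "S")] * 96 + [("R", "R")] * 2 + [("R", "S")] + [("S", "R")]
rates = error_rates(pairs)
print(rates)
```

Keeping VME and ME separate matters clinically, since a missed resistance can lead to ineffective therapy, whereas a falsely predicted resistance mainly narrows the treatment options.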
  
The difference in fusidic acid VMEs could stem from the use of distinct
algorithms or, as noted by Gordon et al., from their study incorporating
low-quality contigs in the analysis of fusidic acid to enhance prediction
accuracy (72).

The lower VME rate of the 1928 platform compared to INH was not limited to
ciprofloxacin and fusidic acid. Upon further comparison of the discrepancies
between each bioinformatics workflow and phenotypic AST, the 1928
platform exhibited a lower combined VME rate (0.8%, 8/1006) compared to
INH (1.5%, 15/1006). However, the 1928 platform had a slightly higher
combined ME rate (0.2%) than INH (0.1%) (Paper I, Table 5). Both methods
demonstrated high accuracy in analyzing antimicrobial susceptibility, with the
1928 platform showing a comparative advantage in minimizing VME.

Identifying virulence factors in bacterial infections, including S. aureus, is
important as these factors influence the infection’s severity and outcome.
Knowing the virulence factors in a specific S. aureus strain helps clinicians
make more informed decisions when selecting antibiotics and treatment (106).
During the study, the clinical lab did not perform reference methods for
virulence factor identification. Therefore, we benchmarked the two
bioinformatics workflows against each other, focusing on the genes included
in the 1928 platform.

The 1928 platform was designed to identify critical virulence genes in S.
aureus infections, including etA and etB, which produce exfoliative toxins
responsible for staphylococcal scalded skin syndrome (107), tsst1 associated
with toxic shock syndrome toxin-1 and severe symptoms in toxic shock
syndrome cases (108), and Panton-Valentine Leucocidin (PVL) exotoxin
encoded by lukF-PVL and lukS-PVL genes, contributing significantly to
infection severity and adverse outcomes in invasive diseases (109). The
comparison between the 1928 platform and INH revealed a high level of
agreement in predicting these specific virulence genes in S. aureus strains,
achieving an overall agreement of 99.4% (1267/1275, 95% CI: 98.7-99.7)
(Paper I, Table 7). However, the 1928 platform uniquely identified more
isolates harboring the virulence genes etA (n=2), etB (n=2), and tsst1 (n=4)
than INH (Paper I, Table 7). This highlights potential differences in the
sensitivity or specificity of the two bioinformatics workflows for these
virulence genes.

MLST is a widely adopted method for bacterial typing, crucial for
investigating outbreaks caused by various pathogens (110-113). We evaluated
the 1928 platform and INH in predicting MLST types to assess their
consistency. In our study, out of 255 isolates, 236 (92.5%, CI: 88.6-95.2)
displayed consistent MLST types between the 1928 platform and the INH
(Paper I, Table 8). Both methods agreed in 97.9% (231/236, 95% CI 95.0–99.2%)
of cases in predicting the ST of S. aureus, except for 19 isolates
where neither platform could determine the ST. Comparison of different
MLST software in NGS analysis has shown different performance in
determining the ST (114). We also noted discrepancies between the two
methods: three STs were detected by the 1928 platform but not by the INH,
and two STs were detected by the INH but not by the 1928 platform. These
discrepancies may be due to differences in the algorithms used by the
platforms, variations in database references, or the quality of the sequencing
data.

One of the main requirements for the adoption of WGS in infection control
and public health is speed, as timely identification of infectious agents can be
critical in the diagnosis and treatment of patients. Using the same
computational system, the estimated time required to analyze one bacterial
isolate with two FASTQ paired-end files using the INH workflow was 5–6
hours. In contrast, the 1928 platform completed the same analysis in just
15–30 minutes.

Another crucial factor is the reliability of results from bioinformatics
workflows. Both workflows showed high agreement with clinical diagnoses
for S. aureus. However, the INH workflow requires formal bioinformatics
support, which may make it more complex to implement in clinical settings
compared to the 1928 platform, where users simply upload FASTQ files.
Conversely, the 1928 platform is limited to its built-in analyses, whereas the
INH pipeline offers the advantage of expansion with additional analyses from
CGE.
  
4.2 PAPER II- EPIDEMIOLOGY AND ANTIBIOTIC RESISTANCE PATTERN

A major concern associated with S. aureus infections is the emergence of
antibiotic resistance. MRSA is a particular concern as it resists many
commonly used antibiotics, complicating treatment and leading to increased
morbidity, mortality, and healthcare costs (115).

Our aim in this study was to identify the epidemiology and resistance patterns
of S. aureus strains isolated in the Skaraborg sepsis study. We explored
laboratory records of 262 strains obtained from 212 patients (aged 18 to 97).

Our study identified 1.1% (3/262) of S. aureus strains as MRSA, aligning with
the 2021 Swedres-Svarm report from the Public Health Agency of Sweden and
the National Veterinary Institute, which recorded an MRSA prevalence of
1.1% in 2013 (116). Although MRSA prevalence in the Skaraborg region was
relatively low, multidrug resistance to four or more antibiotics (MDR) was
found in 3.4% of the strains, and resistance to one to three antibiotics (ODR)
was observed in 76.7% of the strains (Paper II, Figure 1). This suggests that
while MRSA was not a major concern locally, other forms of
antibiotic-resistant S. aureus could pose a public health risk.

The treatment of S. aureus infections is primarily guided by the bacterial
strain’s antibiotic resistance profile and the severity of the infection. According
to the Swedres-Svarm report, MRSA strains were resistant to clindamycin and
erythromycin (116); however, our study found that the MRSA isolates were
susceptible to clindamycin, erythromycin, and fusidic acid (Paper II, Table 3).
The discrepancy highlights variations in resistance patterns reported across
different regions (117-119). Furthermore, our study found that vancomycin, a
potent antibiotic commonly used to treat MRSA infections, was effective
against all isolates (Paper II, Table 3). This result was consistent with the
Swedres-Svarm report, which also indicated no vancomycin resistance among
S. aureus isolates (116).

Our findings demonstrated high resistance (>80%) of S. aureus strains to
several commonly used antibiotics, including penicillin V (MRSA, MDR, and
ODR strains), penicillin G, and piperacillin (both affecting MRSA and MDR
strains), as well as isoxazolyl penicillin (MRSA strains). These results are
consistent with other studies documenting high penicillin resistance rates
among S. aureus strains across different geographical locations (120, 121).

There is no clear consensus on the correlation between gender and S. aureus
infection, with varying prevalence rates reported between males and females
(122). Some studies have suggested that males may be at a higher risk of
community-acquired S. aureus infections (123, 124), which was also the case
in our study. Similarly, there were differences between genders in terms of
ODR isolates, but no significant differences were observed in terms of MRSA
and MDR infections (Paper II, Table 1).

Age has been associated with both the incidence and severity of S. aureus
infection, with older adults having a higher rate than younger individuals. For
example, Skogberg et al. (2012) reported that S. aureus bloodstream infections
increased significantly in those aged 65 and older (125), and a study in
Denmark similarly found higher infection rates in individuals aged 80 and
above (124). Our study also identified the highest percentage of S. aureus
strains in those over 70 years old (Paper II, Figure 2). Interestingly, the average
age of infection was higher in females (74 years) compared to males (69 years,
p=0.03) (Paper II, Table 1), which may be related to menopause and its impact
on immune function, though further research is needed to explore this.

S. aureus is commonly found in the nasal passages and on the skin of healthy
individuals (126, 127). In our study, nasal carriage of S. aureus strains was also
prevalent, with many isolates obtained from upper respiratory tract specimens
(Paper II, Table 2), reinforcing the importance of nasal carriage as a reservoir
for these bacteria.

In summary, the region had a low prevalence of MRSA, with most strains
resistant to one to three antibiotics, underscoring the ongoing challenge of
antibiotic resistance in S. aureus infections. The study also highlighted that age
and gender are significant factors, with higher prevalence in males and
individuals over 70 years old.
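The resistance categories used in this section (MDR for resistance to four or more antibiotics, ODR for resistance to one to three) can be expressed as a small classification rule. The sketch below is illustrative only: the isolate names and resistance lists are hypothetical, the label "none" for fully susceptible isolates is an assumption, and MRSA status, which depends on methicillin resistance rather than on a count, is deliberately not modelled.

```python
from collections import Counter


def resistance_category(n_resistant):
    """Map the number of antibiotics an isolate resists to MDR, ODR or none."""
    if n_resistant < 0:
        raise ValueError("count cannot be negative")
    if n_resistant >= 4:
        return "MDR"  # resistance to four or more antibiotics
    if n_resistant >= 1:
        return "ODR"  # resistance to one to three antibiotics
    return "none"  # assumed label for isolates with no recorded resistance


# Hypothetical isolates with the antibiotics they were resistant to.
isolates = {
    "isolate_A": ["penicillin V"],
    "isolate_B": ["penicillin V", "penicillin G", "piperacillin", "erythromycin"],
    "isolate_C": [],
    "isolate_D": ["penicillin V", "fusidic acid"],
}

categories = {name: resistance_category(len(drugs)) for name, drugs in isolates.items()}
summary = Counter(categories.values())
print(categories, summary)
```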
48 49 
  
Biomarker profiling in sepsis diagnostics Mahnaz Irani Shemirani 
4.2 PAPER II- EPIDEMIOLOGY AND There is no clear consensus on the correlation between gender and S. aureus 
ANTIBIOTIC RESISTANCE PATTERN  infection, with varying prevalence rates reported between males and females (122). Some studies have suggested that males may be at a higher risk of 
 community-acquired S. aureus infections (123, 124), which was also the case 
in our study. Similarly, there were differences between genders in terms of 
A major concern associated with S. aureus infections is the emergence of ODR isolates, but no significant differences were observed in terms of MRSA 
antibiotic resistance. MRSA is a particular concern as it resists many and MDR infections (Paper II, Table 1). 
commonly used antibiotics, complicating treatment and leading to increased 
morbidity, mortality, and healthcare costs (115).

Our aim in this study was to identify the epidemiology and resistance patterns of S. aureus strains isolated in the Skaraborg sepsis study. We explored laboratory records of 262 strains obtained from 212 patients (aged 18 to 97).

Our study identified 1.1% (3/262) of S. aureus strains as MRSA, aligning with the 2021 Swedres-Svarm report from the Public Health Agency of Sweden and the National Veterinary Institute, which recorded an MRSA prevalence of 1.1% in 2013 (116). Although MRSA prevalence in the Skaraborg region was relatively low, multidrug resistance to four or more antibiotics (MDR) was found in 3.4% of the strains, and resistance to one to three antibiotics (ODR) was observed in 76.7% of the strains (Paper II, Figure 1). This suggests that while MRSA was not a major concern locally, other forms of antibiotic-resistant S. aureus could pose a public health risk.

The treatment of S. aureus infections is primarily guided by the bacterial strain’s antibiotic resistance profile and the severity of the infection. According to the Swedres-Svarm report, MRSA strains were resistant to clindamycin and erythromycin (116); however, our study found that the MRSA isolates were susceptible to clindamycin, erythromycin, and fusidic acid (Paper II, Table 3). The discrepancy highlights variations in resistance patterns reported across different regions (117-119). Furthermore, our study found that vancomycin, a potent antibiotic commonly used to treat MRSA infections, was effective against all isolates (Paper II, Table 3). This result was consistent with the Swedres-Svarm report, which also indicated no vancomycin resistance among S. aureus isolates (116).

Our findings demonstrated high resistance (>80%) of S. aureus strains to several commonly used antibiotics, including penicillin V (MRSA, MDR, and ODR strains), penicillin G, and piperacillin (both affecting MRSA and MDR strains), as well as isoxazolyl penicillin (MRSA strains). These results are consistent with other studies documenting high penicillin resistance rates among S. aureus strains across different geographical locations (120, 121).

Age has been associated with both the incidence and severity of S. aureus infection, with older adults having a higher rate than younger individuals. For example, Skogberg et al. (2012) reported that S. aureus bloodstream infections increased significantly in those aged 65 and older (125), and a study in Denmark similarly found higher infection rates in individuals aged 80 and above (124). Our study also identified the highest percentage of S. aureus strains in those over 70 years old (Paper II, Figure 2). Interestingly, the average age at infection was higher in females (74 years) than in males (69 years; p=0.03) (Paper II, Table 1), which may be related to menopause and its impact on immune function, though further research is needed to explore this.

S. aureus is commonly found in the nasal passages and on the skin of healthy individuals (126, 127). In our study, nasal carriage of S. aureus strains was also prevalent, with many isolates obtained from upper respiratory tract specimens (Paper II, Table 2), reinforcing the importance of nasal carriage as a reservoir for these bacteria.

In summary, the region had a low prevalence of MRSA, with most strains resistant to one to three antibiotics, underscoring the ongoing challenge of antibiotic resistance in S. aureus infections. The study also highlighted that age and gender are significant factors, with higher prevalence in males and individuals over 70 years old.
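The resistance categories used in this section (ODR for resistance to one to three antibiotics, MDR for four or more) can be made concrete with a short sketch. It assumes each isolate is summarised by the number of tested antibiotics it resists. The function names, the "susceptible" label, and the toy counts are illustrative assumptions and are not taken from Paper II; MRSA status depends on methicillin resistance specifically and is not derived from this count.

```python
from collections import Counter


def resistance_category(n_resistant: int) -> str:
    """Map the number of antibiotics an isolate resists to a category label."""
    if n_resistant < 0:
        raise ValueError("resistance count cannot be negative")
    if n_resistant == 0:
        return "susceptible"  # hypothetical label for isolates with no resistance
    if n_resistant <= 3:
        return "ODR"  # resistant to one to three antibiotics
    return "MDR"  # resistant to four or more antibiotics


def category_shares(counts):
    """Return the percentage of isolates in each category, rounded to one decimal."""
    tally = Counter(resistance_category(n) for n in counts)
    total = len(counts)
    return {label: round(100 * k / total, 1) for label, k in tally.items()}


# Toy input: invented resistance counts for eight isolates (not study data).
toy_counts = [0, 1, 2, 3, 4, 6, 1, 0]
shares = category_shares(toy_counts)

# The MRSA proportion quoted in the text, 3 of 262 strains, as a percentage.
mrsa_share = round(100 * 3 / 262, 1)
print(shares, mrsa_share)
```

The same rounding reproduces the 1.1% figure quoted above for 3 of 262 strains.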
4.3 PAPER III- IDENTIFYING A POSSIBLE PROTEIN BIOMARKER PANEL

Over recent decades, distinct inflammatory biomarker patterns for gram-positive and gram-negative bacterial sepsis have been proposed (128-131), presenting promising opportunities for faster diagnosis and targeted treatments. These advancements have the potential to significantly reduce the time required to initiate appropriate therapies and improve patient outcomes, in contrast to conventional blood cultures, which are time-consuming and may delay treatment (132).

This study aimed to assess whether a panel of protein blood biomarkers could effectively differentiate between gram-positive and gram-negative bacteria directly from blood samples using PEA technology. We selected four Olink panels (cardiometabolic, immune response, inflammation, and cardiovascular II) based on the involvement of sepsis in multiple physiological processes (24, 25).

PEA technology is highly sensitive and capable of multiplexed protein detection (133), but it often yields a considerable number of missing values because measurements fall below the limit of detection (LOD). Therefore, it is crucial to handle these missing values appropriately to ensure accurate outcomes in subsequent statistical analyses and machine learning algorithms. In this study, we aimed to determine the optimal proportion of missing values tolerated per protein and identify the most effective algorithm for our data. To manage missing data, we used the GSimp imputation method, which estimates missing values based on the distribution of detected values above LOD (134). We created ten imputed datasets with varying levels of missing data and compared the performance of three widely used feature selection algorithms in identifying biological markers for discriminating patients with sepsis: RF (135-139), Lasso (140-142), and RFE (143-146).

Our study found that proteins with less than 40% missing values provided the most accurate predictions, resulting in 285 relevant proteins for analysis. In contrast, both stringent filtration (<5%) and excessively permissive inclusion (<80%) significantly impaired prediction accuracy. This observation aligns with the manufacturer’s recommended range of 25-50% missing values (147). These findings offer valuable insights into balancing the retention of essential information with maintaining prediction accuracy.

Furthermore, the assessment of accuracy across datasets with 80% and 5% missing values revealed that the performance remained comparable, even though the 80% dataset relied heavily on imputed data (Paper III, Figure 2). This result highlights the effectiveness of our imputation method, GSimp, in addressing missing values, a common challenge in proteomics research.

Our study results showed that for the dataset with 40% missing data, encompassing 285 proteins, Lasso achieved the highest classification accuracy at 71.2%, outperforming RF (63.0%) and RFE-LR (67.0%) (Paper III, Figure 2). Through Lasso regression, we identified 55 proteins as the most predictive biomarkers for distinguishing between the three groups: patients with gram-positive bacterial infections, patients with gram-negative bacterial infections, and healthy controls. Evaluation of the 55 selected proteins showed a perfect classification performance, with an AUC of 1.0, for distinguishing bacterial infections from healthy controls. However, the model exhibited only moderate performance for gram-positive infections (AUC: 0.66, 95% CI: 0.549-0.771) and gram-negative infections (AUC: 0.69, 95% CI: 0.586-0.794) (Paper III, Figure 4), indicating that the 55 proteins may not be sufficient for accurate differentiation between these bacterial types. The low AUC values may be due to several factors. A possible overlap in protein profiles between gram-positive and gram-negative infections, due to inherent similarities, makes it challenging to distinguish between these groups. Moreover, variability within patient groups, including different bacterial types, may have affected the results. Finally, the complexity of sepsis and the limited range of proteins analyzed suggest that broader proteomic approaches such as mass spectrometry might improve classification accuracy.

The selection of biomarkers can depend on the specifics of the study and the biological context. Zhang et al. (2017) (148) employed an RFE-LR model to assess 49 blood biomarkers, including leukocytes and cytokines such as IFN-γ, in patients with peritonitis. Their model achieved an AUC of 0.993 for distinguishing gram-negative infections with eight biomarkers and 0.711 for gram-positive infections with five biomarkers. Notably, IFN-γ was among the 55 proteins identified in our study. Zhang’s research aimed to differentiate between Streptococcal and Staphylococcal bacteria and gram-negative strains, indicating the potential for species-specific biomarker profiles. However, our study’s limited bacterial sample size restricted our ability to identify markers unique to each bacterial species. Additionally, another study using ELISA and analyzing eight cytokines found elevated levels of IFN-γ, TNF-α, IL-1ra, and IL-10 in gram-negative infections (130), which were also identified in our research, assisting in the differentiation between gram-positive and gram-
negative infections. Variations in the findings may be attributed to differences in methodologies, the number of biomarkers analyzed, and study designs.

In our study, we aimed to assess the selection of predictive proteins made by the Lasso algorithm in more detail. We evaluated the distribution and performance of these proteins using skewness and kurtosis as measures of symmetry and peakedness, respectively. According to published criteria, a distribution is considered approximately normal when skewness lies between -2 and +2 and kurtosis between -7 and +7 (149, 150). In patients with gram-positive bacterial infection, most proteins fell within these ranges, except for TNF, TNFRSF13B, and ADA, which deviated in skewness (Paper III, Figure 6A) and kurtosis (Paper III, Figure 6B). In patients with gram-negative bacterial infection, most proteins also remained within the normal skewness and kurtosis ranges, except for MFAP5, TNFRSF13B, and CD8A, which showed deviations (Paper III, Figures 6A and 6B). We further investigated the impact of the five outlying proteins (MFAP5, TNF, TNFRSF13B, CD8A, and ADA) on the performance of the predictive set. Excluding these proteins led to a slight decrease in sensitivity and specificity for both gram-positive (AUC=0.61, 95% CI: 0.536-0.684) and gram-negative bacterial septic patients (AUC=0.66, 95% CI: 0.588-0.732) (Paper III, Figure 6C). However, this decrease was not statistically significant, as the univariate analysis revealed that the levels of these proteins were not markedly different between the two groups. Violin plots also indicated similar expression patterns, though fluctuations suggested a possible subpopulation effect in infection classification (Paper III, Figure 6D).

Correlation analysis of the 55 predictive proteins using pairwise PCC in patients with gram-positive and gram-negative infections revealed a range of correlations (Paper III, Figure 7). Most proteins exhibited weak to moderate positive correlations (coefficients ranging from 0 to 0.5), suggesting shared expression patterns and potentially similar roles in bacterial response. This was further supported by network analysis using STRING and Cytoscape, which identified a core group of proteins, such as TNF, IL6, IL-1ra, IL10, IFN-γ, CD8A, CCL19, SELL, CCL17, CCL25, IL7R, CSF-1, and LEP, with extensive interactions and a significant PPI enrichment p-value (<1.0e-16), indicating a common expression pattern. Additionally, gene ontology analysis with the PANTHER database revealed that while 22 proteins did not map to specific biological processes, the remaining proteins were enriched in categories such as cellular processes, biological regulation, response to stimuli, signaling, immune system processes, and metabolic processes. Notably, 10 proteins showed significant enrichment in interspecies interaction.

Conversely, some proteins exhibited weak negative correlations, with PCC values between 0 and -0.5. Notably, SAA4 (n=34), CFHR5 (n=32), CNDP1 (n=32), GDF2 (n=29), and MBL2 (n=27) were the five proteins with the highest numbers of negative correlations. These negative correlations between the predictive proteins for gram-positive and gram-negative bacterial infections suggest that fluctuations in their levels may reflect bacterial type.

Our literature review revealed that serum amyloid A-4 protein (SAA4) is a significant acute phase reactant and a potential biomarker for sepsis prognosis (151). Complement factor H-related 5 (CFHR5) plays a role in complement regulation and, alongside other proteins, serves as a biosignature for tuberculosis infections (152). Carnosine dipeptidase 1 (CNDP1) may predict mortality in S. aureus infections (153). Low serum levels of mannose-binding lectin 2 (MBL2) are associated with higher mortality in severe infections caused by Streptococcus pneumoniae (154-157). Growth differentiation factor 2 (GDF2), also known as bone morphogenetic protein 9 (BMP9), is crucial for bone and cartilage development and angiogenesis, though it is not currently recognized as a sepsis biomarker (158-160). These findings suggest that these five proteins could be potential biomarkers for bacterial type identification in patients with suspected sepsis, though further research is needed to fully understand their clinical relevance.
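The per-protein counts of weak negative correlations discussed in this section (the n values reported for SAA4, CFHR5, CNDP1, GDF2, and MBL2) can be illustrated with a minimal sketch. It assumes that a pair is counted when its Pearson coefficient lies strictly between -0.5 and 0. The function names, the threshold handling, and the toy protein levels (P1 to P4) are assumptions made for illustration and are not the analysis code or data of Paper III.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def count_weak_negative_partners(levels, lower=-0.5):
    """For each protein, count the other proteins with lower < r < 0."""
    names = list(levels)
    counts = {name: 0 for name in names}
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(levels[first], levels[second])
            if lower < r < 0:
                counts[first] += 1
                counts[second] += 1
    return counts


# Toy input: invented levels for four hypothetical proteins in five samples.
toy_levels = {
    "P1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "P2": [2.0, 1.0, 4.0, 3.0, 5.0],
    "P3": [5.0, 3.0, 1.0, 2.0, 4.0],
    "P4": [5.0, 4.0, 2.0, 3.0, 1.0],
}
counts = count_weak_negative_partners(toy_levels)
print(counts)
```

In the toy data, P3 is weakly negatively correlated with both P1 and P2, whereas the strongly negative pairs involving P4 fall outside the weak range and are not counted.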
4.4 PAPER IV- TRANSCRIPTOMIC MARKERS

In our previous study, we suggested that the host response to gram-negative and gram-positive bacterial infections in adults suspected of sepsis might be species-specific. However, the limited sample sizes for unique species restricted our ability to explore this aspect comprehensively. We also proposed that analyzing whole blood biomarkers, rather than focusing solely on specific biomarkers, could provide a more accurate model for understanding sepsis. Additionally, concentrating exclusively on patients with confirmed sepsis, rather than those merely suspected of having the condition, helps to eliminate the confounding effects associated with a mixed population of sepsis and non-sepsis patients.

Building on these observations and suggestions, this study aimed to identify biomarkers that could differentiate between E. coli-induced sepsis, S. aureus-induced sepsis, and healthy individuals by examining the transcriptional response in adults. We utilized gene expression profiles acquired through microarray technology and employed the Lasso regression model for analysis.

Our study found that 25 predictive genes from a pool of 22,277 genes effectively distinguished E. coli-induced sepsis, S. aureus-induced sepsis, and healthy controls. The model achieved a predictive accuracy of 80% with an MSE of 0.20. The evaluation of the performance of the 25 genes using LR and AUC yielded an AUC of 0.96 for distinguishing E. coli-induced sepsis, an AUC of 0.98 for S. aureus-induced sepsis, and a perfect AUC of 1.0 for differentiating healthy controls from the other cases (Paper IV, Figure 1A). These findings align with those of Ahn et al. (58), who reported high AUC values for discriminating sepsis from healthy controls, with AUCs of 0.92 for E. coli-induced sepsis and 0.9898 for S. aureus-induced sepsis.

Further, unsupervised clustering analysis using a PCA plot revealed distinct separation between the healthy control group and the infection-induced groups, though there was partial overlap between the E. coli- and S. aureus-induced sepsis groups, indicating similarities in gene expression or variability within each group. The PCA also suggested the presence of subpopulations within these groups (Paper IV, Figure 1B). Additional assessment of the 25 predictive genes using skewness and kurtosis metrics (with normal ranges defined as ±2 for skewness and ±7 for kurtosis; Hair et al., 2010; Byrne, 2010) alongside the Mann-Whitney U test identified three differentially expressed genes (EIF1AY, APOBEC3B, and GUSBP3) in the E. coli-induced sepsis group (Paper IV, Figure 2A & B). Filtering these three genes resulted in less visibility of subpopulations within each group, with the PCA suggesting that EIF1AY significantly influenced this change (Paper IV, Figure 2D). This gene appears to be expressed in the female group with E. coli-induced sepsis, suggesting a potential association between the gene’s expression and gender-specific responses to E. coli-induced sepsis, which warrants further investigation.

We also performed a two-group classification analysis to distinguish between E. coli-induced sepsis and S. aureus-induced sepsis using the 25 predictive genes. The initial analysis yielded an AUC of 0.75, indicating an acceptable ability to distinguish between these two sepsis groups. However, after excluding the three differentially expressed genes, the AUC improved to 0.89, markedly enhancing the model’s discriminatory power. This improved AUC is comparable to the previously reported AUC of 0.8503 achieved with the Bayesian sparse factor classifier (58). Additionally, hierarchical heatmap clustering analysis revealed two distinct expression patterns associated with E. coli-induced and S. aureus-induced sepsis, highlighting a clear separation between the responses to these pathogens. Complementary PCC analysis showed weak inter-gene correlations, with most values falling within the low positive (0 to 0.25) and low negative (-0.25 to 0) ranges, which aligns with the high predictive performance of the gene set and suggests its effectiveness in distinguishing between the two types of sepsis.

Imbalanced data can lead to biased algorithm performance, favoring the majority class and compromising the accuracy and generalizability of the model (161). Our analysis identified an underrepresentation of older males and an uneven distribution across the three groups (E. coli-induced sepsis, S. aureus-induced sepsis, and healthy controls). To address this issue, we implemented an upsampling technique and conducted a stability analysis. By applying a multi-stage upsampling strategy, we increased the sample size from 94 to 151 and then to 228 while maintaining 22,277 genes. This approach allowed us to train and test our predictive genes using age- and gender-balanced samples and equal sample sizes across all groups. As a result, model performance improved significantly, with PCA plots showing enhanced separation among the groups and a perfect AUC of 1 across all groups. These findings demonstrate the model’s robustness across balanced datasets. Interestingly, after gender balancing, the initial subpopulation identified by the PCA plot became less pronounced, providing valuable insights into the gender-related effects on susceptibility to or development of sepsis.
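The idea of upsampling to obtain balanced groups, as described for Paper IV, can be sketched as random resampling with replacement within each stratum until every stratum reaches the size of the largest one. This is a simplified, single-stage illustration: the multi-stage procedure used in the study is not detailed in this section, and the function name, the stratification by group and sex, and the toy records are assumptions rather than the study’s code or data.

```python
import random
from collections import Counter, defaultdict


def upsample_to_balance(samples, key, seed=0):
    """Resample with replacement so every stratum matches the largest stratum."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    strata = defaultdict(list)
    for sample in samples:
        strata[key(sample)].append(sample)
    target = max(len(members) for members in strata.values())
    balanced = []
    for label in sorted(strata):
        members = strata[label]
        balanced.extend(members)  # keep every original sample
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


def make_records(group, sex, n):
    """Build n toy sample records for one stratum (hypothetical helper)."""
    return [{"id": f"{group}-{sex}-{i}", "group": group, "sex": sex} for i in range(n)]


# Toy input: ten invented records with uneven group and sex composition.
toy = (
    make_records("E. coli", "F", 3) + make_records("E. coli", "M", 1)
    + make_records("S. aureus", "F", 2) + make_records("S. aureus", "M", 2)
    + make_records("healthy", "F", 1) + make_records("healthy", "M", 1)
)

balanced = upsample_to_balance(toy, key=lambda s: (s["group"], s["sex"]))
sizes = Counter((s["group"], s["sex"]) for s in balanced)
print(len(balanced), dict(sizes))
```

Adding an age band to the key would balance on age as well. Because duplicated samples carry no new information, this sketch would normally be applied to the training portion only, so that copies of one sample cannot appear on both sides of a train/test split; whether the study proceeded this way is not stated in this section.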
54 55 
  
Biomarker profiling in sepsis diagnostics Mahnaz Irani Shemirani 
4.4 PAPER IV- TRANSCRIPTOMIC MARKERS (Paper IV, Figure 2A & B). Filtering these three genes resulted in less visibility 
of subpopulations within each group, with the PCA analysis suggesting that 
 EIF1AY significantly influenced this change (Paper IV, Figure 2D). This gene 
appears to be expressed in the female group with E. coli-induced sepsis, 
In our previous study, we suggested that the host response to gram-negative suggesting a potential association between the gene’s expression and gender-
and gram-positive bacterial infections in adults suspected of sepsis might be specific responses to E. coli-induced sepsis, which warrants further 
species-specific. However, the limited sample sizes for unique species investigation. 
restricted our ability to explore this aspect comprehensively. We also proposed 
that analyzing whole blood biomarkers, rather than focusing solely on specific In our study, we aimed to perform a two-group classification analysis to 
biomarkers, could provide a more accurate model for understanding sepsis. distinguish between E. coli-induced sepsis and S. aureus-induced sepsis 
Additionally, concentrating exclusively on patients with confirmed sepsis— using the 25 predictive genes. The initial analysis yielded an AUC of 0.75, 
rather than those merely suspected of having the condition—helps to eliminate indicating an acceptable ability to distinguish between these two sepsis groups. 
the confounding effects associated with a mixed population of sepsis and non- However, after excluding three differentially expressed genes, the AUC 
sepsis patients. improved to 0.89, significantly enhancing the model’s discriminatory power. 
This improved AUC is comparable to the previously reported AUC of 0.8503 
Binding on these observations and suggestions, this study aimed to identify achieved with the Bayesian sparse factor classifier (58). Additionally, 
biomarkers that could differentiate between E. coli-induced sepsis, S. aureus- hierarchical heatmap clustering analysis revealed two distinct expression 
induced sepsis, and healthy individuals by examining the transcriptional patterns associated with E. coli-induced and S. aureus-induced sepsis, 
response in adults. We utilized gene expression profiles acquired through highlighting a clear separation between the responses to these pathogens. 
microarray technology and employed the power of the Lasso regression model Complementary PCC analysis showed weak inter-gene correlations, with most 
for analysis. values falling within the low positive (0 to 0.25) and low negative (-0.25 to 0) 
ranges, which aligns with the high predictive performance of the gene set and 
Our study found that 25 predictive genes from a pool of 22,277 genes suggests its effectiveness in distinguishing between the two types of sepsis. 
effectively distinguished E. coli- or S. aureus-induced sepsis or healthy 
controls. The model achieved a predictive accuracy of 80% with an MSE of Imbalanced data can lead to biased algorithm performance, favoring the 
0.20. The evaluation of the performance of the 25 genes using LR and AUC yielded an AUC of 0.96 for distinguishing E. coli-induced sepsis, an AUC of 0.98 for S. aureus-induced sepsis, and a perfect AUC of 1.0 for differentiating healthy controls from the other cases (Paper IV, Figure 1A). These findings align with those of Ahn et al. (58), who reported high AUC values for discriminating sepsis from healthy controls, with AUCs of 0.92 for E. coli-induced sepsis and 0.9898 for S. aureus-induced sepsis.

Further, unsupervised clustering analysis using a PCA plot revealed distinct separation between the healthy control group and the infection-induced groups, though there was partial overlap between the E. coli- and S. aureus-induced sepsis groups, indicating similarities in gene expression or variability within each group. PCA analysis also suggested the presence of subpopulations within these groups (Paper IV, Figure 1B). Additional assessment of the 25 predictive genes using skewness and kurtosis metrics (with normal ranges defined as ±2 for skewness and ±7 for kurtosis; Hair et al., 2010; Byrne, 2010) alongside the Mann-Whitney U test identified three differentially expressed genes (EIF1AY, APOBEC3B, and GUSBP3) in the E. coli-induced sepsis group

majority class and compromising the accuracy and generalizability of the model (161). Our analysis identified an underrepresentation of older males and an uneven distribution across groups, including E. coli-induced sepsis, S. aureus-induced sepsis, and healthy controls. To address this issue, we implemented an upsampling technique and conducted a stability analysis. By applying a multi-stage upsampling strategy, we increased the sample size from 94 to 151 and then to 228 while maintaining 22,277 genes. This approach allowed us to train and test our predictive genes using age-gender balanced samples and equal sample sizes across all groups. As a result, model performance improved significantly, with PCA plots showing enhanced separation among the groups and a perfect AUC of 1 across all groups. These findings demonstrate the model’s robustness across balanced datasets. Interestingly, after gender balancing, the subpopulation initially identified in the PCA plot became less pronounced, providing valuable insights into gender-related effects on susceptibility to or development of sepsis.
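The multi-stage upsampling described in this section can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each sample carries `group` and `sex` fields and that balancing is done by random resampling with replacement. The thesis text does not specify the exact procedure, so the function names, the two stages shown, and the toy counts are illustrative assumptions rather than the method used in Paper IV.

```python
import random
from collections import Counter, defaultdict


def upsample_to_balance(samples, key, seed=0):
    """Resample with replacement until every stratum matches the largest one."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for sample in samples:
        strata[key(sample)].append(sample)
    target = max(len(members) for members in strata.values())
    balanced = []
    for members in strata.values():
        balanced.extend(members)  # keep every original sample
        balanced.extend(rng.choices(members, k=target - len(members)))
    return balanced


def two_stage_upsample(samples, seed=0):
    """Stage 1 balances sex within each group; stage 2 equalizes all strata."""
    by_group = defaultdict(list)
    for sample in samples:
        by_group[sample["group"]].append(sample)
    stage1 = []
    for members in by_group.values():
        stage1.extend(upsample_to_balance(members, lambda s: s["sex"], seed))
    stage2 = upsample_to_balance(stage1, lambda s: (s["group"], s["sex"]), seed)
    return stage1, stage2


# Toy cohort with made-up counts (not the thesis data).
toy = (
    [{"group": "E. coli", "sex": "F"}] * 5
    + [{"group": "E. coli", "sex": "M"}] * 2
    + [{"group": "S. aureus", "sex": "F"}] * 3
    + [{"group": "S. aureus", "sex": "M"}] * 4
    + [{"group": "healthy", "sex": "F"}] * 2
    + [{"group": "healthy", "sex": "M"}] * 2
)
stage1, stage2 = two_stage_upsample(toy)
counts = Counter((s["group"], s["sex"]) for s in stage2)
print(len(toy), len(stage1), len(stage2))  # sample size after each stage
```

Because resampled entries are copies of existing samples, a common precaution is to upsample only within the training folds, so that copies of one sample do not end up in both the training and the test data.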
We further evaluated the 25-gene model on two independent datasets to assess its generalizability and reliability. The first, GSE13015, comprised whole blood samples from adults with sepsis induced by E. coli or S. aureus, as well as healthy controls. The model achieved AUC values of 0.79 for E. coli, 0.72 for S. aureus, and 0.87 for healthy controls, supporting its ability to predict sepsis-related conditions. In the second dataset, GSE65088, which represents a pre-sepsis stage (bacteremia) and was used to validate the model’s ability to differentiate between E. coli and S. aureus infections, the model achieved an AUC of 0.62. Despite differences in gene availability (23 of the 25 genes in the first dataset and 24 in the second), sample sizes (smaller in the validation datasets than in the training set), and dataset homogeneity, the model retained discriminative ability, although at a lower level than in the training data.

In 2019, Chen et al. (162) investigated transcriptional biomarkers in patients with E. coli- and S. aureus-induced sepsis in a cross-sectional study. Their analysis, however, specifically targeted the nine genes that consistently appeared in all datasets. Within this subset, LILRA5 and TNFAIP6 are of particular interest because they are also among the 25 genes identified in our study. LILRA5 is a leukocyte immunoglobulin-like receptor that plays an important role in modulating immune responses by engaging with various ligands. It has been proposed to function in recognizing both viral and bacterial pathogens and in driving the inflammatory response (163-165). Similarly, TNFAIP6 (Tumor Necrosis Factor Alpha-Inducible Protein 6) is involved in inflammation and immunity. Studies have demonstrated that this protein, by affecting the production of pro-inflammatory cytokines and chemokines, regulates the activity of immune cells, including macrophages (166).

To further explore gene associations, we conducted PPI and pathway analysis. The PPI network analysis using STRING identified a sparse network with 22 nodes and 6 edges but highlighted significant interactions among 8 proteins (IFIT1, IFI27, GBP1, FCGR1A, EIF1AY, DDX3Y, HIST1H1T (H1-6), and HIST1H1AD (H2AC7)), supported by a PPI enrichment p-value (< 0.0187) (Paper IV, Figure 5A), suggesting functional or physical connections among these proteins. Furthermore, the MCODE plug-in of Cytoscape clustered densely connected proteins, revealing a cluster containing IFI27, IFIT1, GBP1, and FCGR1A (Paper IV, Figure 5B), with functional associations also observed between EIF1AY and DDX3Y, as well as HIST1H1T and HIST1H1AD. However, the other proteins did not exhibit associations, suggesting distinct pathways triggered by genes related to each bacterium. The pathway analysis revealed both shared and unique pathways, with shared pathways of immune system responses and cytokine signaling possibly explaining the partial overlap in the PCA plot, while distinct pathways may reflect unique aspects or variations of each infection-induced sepsis group. Our findings support the idea that the body’s response to sepsis involves both a general immune response and a specific response to each bacterial group, as suggested by other researchers (162). In summary, exploring blood transcriptional markers can aid in distinguishing between patients with E. coli- and S. aureus-induced sepsis. Nevertheless, to determine clinical significance, it is essential to conduct studies with larger cohorts and utilize more sophisticated algorithms.
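The external validation step discussed in this section, in which only part of the 25-gene signature was available in each dataset, can be sketched as follows. The helper names, the toy gene lists, and the predicted probabilities are hypothetical. The AUC is computed from its pairwise (Mann-Whitney) definition in a one-vs-rest fashion, which is one common way to obtain per-class AUCs and may differ from the implementation used in Paper IV.

```python
def split_signature(signature, dataset_genes):
    """Return the signature genes present in a dataset and those missing."""
    present = [g for g in signature if g in dataset_genes]
    missing = [g for g in signature if g not in dataset_genes]
    return present, missing


def auc(scores, is_positive):
    """Probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(probabilities, labels):
    """Per-class AUC: each class in turn is treated as the positive class."""
    return {
        c: auc([row[c] for row in probabilities], [y == c for y in labels])
        for c in sorted(set(labels))
    }


# Toy signature subset and toy dataset gene list (illustrative only).
present, missing = split_signature(
    ["LILRA5", "TNFAIP6", "IFI27", "EIF1AY"],
    {"LILRA5", "IFI27", "EIF1AY", "GAPDH"},
)

# Hypothetical class probabilities from a model fitted on the present genes.
labels = ["E. coli", "E. coli", "S. aureus", "S. aureus", "healthy", "healthy"]
probabilities = [
    {"E. coli": 0.7, "S. aureus": 0.2, "healthy": 0.1},
    {"E. coli": 0.4, "S. aureus": 0.5, "healthy": 0.1},
    {"E. coli": 0.3, "S. aureus": 0.6, "healthy": 0.1},
    {"E. coli": 0.5, "S. aureus": 0.4, "healthy": 0.1},
    {"E. coli": 0.1, "S. aureus": 0.1, "healthy": 0.8},
    {"E. coli": 0.2, "S. aureus": 0.1, "healthy": 0.7},
]
aucs = one_vs_rest_auc(probabilities, labels)
print(missing, aucs)
```

Reporting which signature genes are missing alongside each AUC makes it easier to judge whether a drop in performance reflects the model or the reduced gene set.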
5 CONCLUSION

Our research presents comprehensive findings across several key areas of sepsis diagnostics and treatment, each with its implications for clinical practice.

Paper I demonstrates the potential of genotypic AST as a reliable tool for identifying antibiotic resistance. Despite this, the discrepancies observed emphasize the need for careful validation and interpretation of bioinformatics results, particularly for critical antibiotics. Our study underscores the necessity of standardizing WGS workflows to ensure consistent and reliable results.

Paper II emphasizes the prevalence of S. aureus and antibiotic-resistant strains among elderly individuals (>70 years), revealing a gender effect. The study highlights swab samples as a major reservoir for S. aureus, with nasal carriage being a notable risk factor for infections. The persistence of resistance to certain antibiotics emphasizes the need for ongoing surveillance and adaptation of treatment protocols.

Paper III identifies potential predictive proteins that could differentiate between gram-positive and gram-negative infections, with five candidate biomarkers emerging from fifty-five proteins studied. This research suggests that linear approaches can be valuable for mining complex biomedical datasets. However, the study’s limitations, including a small sample size and a focus on specific protein panels, indicate the need for further validation and exploration of additional biomarkers.

Paper IV supports the dual-level response model of sepsis, comprising a general immune response and a more specific reaction to different bacterial groups. The study also suggests that gender may influence the etiology of sepsis, potentially affecting the type of bacteria responsible for the infection. The application of machine learning techniques in identifying predictive genes demonstrates their potential for advancing sepsis diagnostics, with a thorough evaluation of genes and metrics needed to refine the models further.

Overall, these studies collectively advance our understanding of sepsis diagnostics and treatment, highlighting the importance of continued research and methodological refinement to improve clinical outcomes.

6 FUTURE PERSPECTIVES

Based on our findings, we plan to pursue several key areas of research. We aim to further validate the reliability of genotypic antibiotic susceptibility testing for identifying resistance in other types of bacteria causing sepsis by analyzing whole genome sequences. Additionally, we propose developing standardized WGS workflows to ensure consistent and reliable results in clinical diagnostics. Achieving this will involve collaborating with researchers and clinicians to establish best practices and guidelines.

Future studies will focus on identifying risk factors and potential transmission sources for antibiotic-resistant S. aureus strains across various settings, including healthcare facilities, community environments, and livestock. We observed that females contract S. aureus strains at a higher average age compared to males, prompting further investigation into potential underlying factors such as hormonal changes or immune responses. Evaluating the effectiveness of current infection prevention and control measures in healthcare and community settings will also be a priority, including practices like hand hygiene and environmental cleaning.

In proteome analysis, we will employ techniques such as mass spectrometry or explore additional protein panels to identify more biomarkers for distinguishing bacterial sepsis. The five candidate biomarkers identified, including a newly discovered putative biomarker, could be further validated through experimental studies to confirm their diagnostic potential and understand their role in differentiating between gram-positive and gram-negative bacterial infections.

We will also expand our research to assess the applicability of our 25-gene model in real-world clinical settings and plan to use the same approach to test for other types of bacteria that induce sepsis. In parallel, we will investigate the dual-level immune response to sepsis, examining both the general immune response and the specific reactions to different bacterial groups. Lastly, we plan to explore the role of gender in sepsis etiology, focusing on how gender interacts with bacterial pathogens responsible for sepsis infections.
ACKNOWLEDGEMENT

I would like to sincerely thank everyone who helped and supported me. A lot of people have contributed to the work in this thesis.

First and foremost, I want to thank myself, both my body and soul, for the patience, resilience, and strength to endure the journey of this thesis. I am deeply grateful for the perseverance through the long hours, countless revisions, and the challenges that arose along the way. I also owe an apology to my body and soul for the strain and injuries caused throughout this process. Your endurance through the exhaustion and stress is something I truly appreciate, and I promise to take better care of you in the future.

I am profoundly grateful to my supervisor, Astrid von Mentzer (University of Gothenburg), and my co-supervisors Anders Ståhlberg (University of Gothenburg), Ka-Wei Tang (University of Gothenburg), and Mikael Ejdebäk (University of Skövde). Special thanks to my co-supervisor Erik Kristiansson (Chalmers University) for his exceptional guidance and unwavering support throughout this journey. Without his dedication and commitment to my academic growth, this PhD would not have progressed as it did. I am deeply grateful for his contributions to my work and for helping me overcome challenges along the way.

I would like to extend my gratitude to Andreas Tilevik (University of Skövde) for his technical support, which was crucial in developing the workflow for the WGS paper. I also wish to thank Diana Tilevik (University of Skövde), Anna-Karin Pernestig (University of Skövde), and Helena Enroth (previously at Unilab and University of Skövde) for their work in designing the WGS and Proteomic projects.

To my colleagues and staff at the Division of Biology and Bioinformatics, University of Skövde; the Institute of Biomedicine, University of Gothenburg; and the Umeå Plant Science Centre, especially John Baxter, Anne Uv, Erik Lekholm, Peter Kindgren and Nicolas Delhomme, thank you for your support.

I am also grateful to 1928 Diagnostics AB, TATAA Biocenter AB, Olink Proteomics AB, and SciLifeLab for providing technologies for the WGS and Proteomics projects.

I would like to express my deepest gratitude to my family, especially my mom, Iran. You have been not only my mother but also my friend and my guiding light. I am grateful for your endless patience with my complaints and bad temper. I’m truly sorry for burdening you with my stress, and for the anxiety I caused you when my health struggles took over. Mom, if it weren’t for your encouragement, I would have given up long ago.

Last but not least, I want to thank my dear friends (Atousa, Azar, Beny, Daniel, Elnaz, Javad, Marjan, Mikael, Roghi, and Shahrum) for your unwavering moral support throughout my journey. We laughed and cried together, sharing both the highs and the lows. Your encouragement and understanding have been truly invaluable.

To everyone else who has supported me along the way, even if your name isn’t mentioned here, please know that I am deeply grateful. If I’ve missed anyone, I sincerely apologize; your presence in my life has meant so much.

Images in the thesis have been created with BioRender.com
REFERENCES

1. Geroulanos S, Douka ET. Historical perspective of the word "sepsis". Intensive Care Med. 2006;32(12):2077.
2. Singh S, Evans TW. Organ dysfunction during sepsis. Intensive Care Med. 2006;32(3):349-60.
3. Gyawali B, Ramakrishna K, Dhamoon AS. Sepsis: The evolution in definition, pathophysiology, and management. SAGE Open Med. 2019;7:2050312119835043.
4. Yipp BG, Winston BW. Sepsis without SIRS is still sepsis. Ann Transl Med. 2015;3(19):294.
5. Gul F, Arslantas MK, Cinel I, Kumar A. Changing Definitions of Sepsis. Turk J Anaesthesiol Reanim. 2017;45(3):129-38.
6. Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801-10.
7. Ljungström L, Steinum O, Brink M, Gårdlund B, Martner J, Sjölin J. Diagnostik och diagnoskodning av svår sepsis och septisk chock. ICD10 bör kompletteras med tilläggskoder. Läkartidningen. 2011.
8. Andersson M, Brink M, Cronqvist J, Furebring M, Gille-Johnson P, Ljungström L, et al. Sepsis och septisk chock, tidig identifiering och initial handläggning. Svenska Infektionsläkarföreningen; 2018.
9. Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, et al. Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the Global Burden of Disease Study. The Lancet. 2020;395(10219):200-11.
10. Fleischmann-Struzek C, Mellhammar L, Rose N, Cassini A, Rudd KE, Schlattmann P, et al. Incidence and mortality of hospital- and ICU-treated sepsis: results from an updated and expanded systematic review and meta-analysis. Intensive Care Medicine. 2020;46(8):1552-62.
11. Seree-aphinan C, Vichitkunakorn P, Navakanitworakul R, Khwannimit B. Distinguishing Sepsis From Infection by Neutrophil Dysfunction: A Promising Role of CXCR2 Surface Level. Frontiers in Immunology. 2020;11.
12. Rhee C, Dantes R, Epstein L, Murphy D, Seymour C, Iwashyna T, et al. Incidence and Trends of Sepsis in US Hospitals Using Clinical vs Claims Data, 2009-2014. JAMA. 2017;318(13):1241-9.
13. Lengquist M, Lundberg OHM, Spångfors M, Annborn M, Levin H, Friberg H, et al. Sepsis is underreported in Swedish intensive care units: A retrospective observational multicentre study. Acta Anaesthesiologica Scandinavica. 2020;64(8):1167-76.
14. Janeway CJ, Travers P, Walport M. The front line of host defense. Immunobiology: The Immune System in Health and Disease. 5th ed. New York: Garland Science; 2001.
15. Dolin HH, Papadimos TJ, Chen X, Pan ZK. Characterization of Pathogenic Sepsis Etiologies and Patient Profiles: A Novel Approach to Triage and Treatment. Microbiol Insights. 2019;12:1178636118825081.
16. World Health Organization. Improving the prevention, diagnosis and clinical management of sepsis. 2017 April 13. Report No.: A70/13.
17. Rhee C, Jones TM, Hamad Y, Pande A, Varon J, O’Brien C, et al. Prevalence, Underlying Causes, and Preventability of Sepsis-Associated Mortality in US Acute Care Hospitals. JAMA Network Open. 2019;2(2).
18. Webb SA, Kahler CM. Bench-to-bedside review: Bacterial virulence and subversion of host defences. Crit Care. 2008;12(6):234.
19. Gabarin RS, Li M, Zimmel PA, Marshall JC, Li Y, Zhang H. Intracellular and Extracellular Lipopolysaccharide Signaling in Sepsis: Avenues for Novel Therapeutic Strategies. J Innate Immun. 2021;13(6):323-32.
20. Wang M, Feng J, Zhou D, Wang J. Bacterial lipopolysaccharide-induced endothelial activation and dysfunction: a new predictive and therapeutic paradigm for sepsis. Eur J Med Res. 2023;28(1):339.
21. Dinges M, Orwin P, Schlievert P. Exotoxins of Staphylococcus aureus. Clin Microbiol Rev. 2000;13(1):16-34.
22. Thomas D, Dauwalder O, Brun V, Badiou C, Ferry T, Etienne J, et al. Staphylococcus aureus superantigens elicit redundant and extensive human Vbeta patterns. Infect Immun. 2009;77(5):2043-50.
23. Atanasova KR. Interactions between porcine respiratory coronavirus and bacterial cell wall toxins in the lungs of pigs: Ghent University; 2010.
24. Chang JC. Sepsis and septic shock: endothelial molecular pathogenesis associated with vascular microthrombotic disease. Thromb J. 2019;17:10.
25. Nedeva C, Menassa J, Puthalakath H. Sepsis: Inflammation Is a Necessary Evil. Frontiers in Cell and Developmental Biology. 2019;7(108).
26. Huang M, Cai S, Su J. The Pathogenesis of Sepsis and Potential Therapeutic Targets. Int J Mol Sci. 2019;20(21).
27. Spapen HD, Jacobs R, Honoré PM. Sepsis-induced multi-organ dysfunction syndrome - a mechanistic approach. Journal of Emergency and Critical Care Medicine. 2017;1(10):27.
28. Sagy M, Al-Qaqaa Y, Kim P. Definitions and pathophysiology of sepsis. Curr Probl Pediatr Adolesc Health Care. 2013;43(10):260-3.
29. Chousterman BG, Swirski FK, Weber GF. Cytokine storm and sepsis disease pathogenesis. Seminars in Immunopathology. 2017;39(5):517-28.
30. Ramachandran G. Gram-positive and gram-negative bacterial toxins in sepsis: a brief review. Virulence. 2014;5(1):213-8.
62 63 
  
Biomarker profiling in sepsis diagnostics Mahnaz Irani Shemirani 
REFERENCES 15. Dolin HH, Papadimos TJ, Chen X, Pan ZK. Characterization of Pathogenic Sepsis Etiologies and Patient Profiles: A Novel Approach to 
Triage and Treatment. Microbiol Insights. 2019;12:1178636118825081. 
1. Geroulanos S, Douka ET. Historical perspective of the word "sepsis". 16. World Health Organization. Improving the prevention, diagnosis and 
Intensive Care Med. 2006;32(12):2077. clinical management of sepsis. 2017 April 13. Report No.: A70/13. 
2. Singh S, Evans TW. Organ dysfunction during sepsis. Intensive Care 17. Rhee C, Jones TM, Hamad Y, Pande A, Varon J, O’Brien C, et al. 
Med. 2006;32(3):349-60. Prevalence, Underlying Causes, and Preventability of Sepsis-Associated 
3. Gyawali B, Ramakrishna K, Dhamoon AS. Sepsis: The evolution in Mortality in US Acute Care Hospitals. JAMA Network Open. 
definition, pathophysiology, and management. SAGE Open Med. 2019;2(2). 
2019;7:2050312119835043. 18. Webb SA, Kahler CM. Bench-to-bedside review: Bacterial virulence 
4. Yipp BG, Winston BW. Sepsis without SIRS is still sepsis. Ann Transl and subversion of host defences. Crit Care. 2008;12(6):234. 
Med. 2015;3(19):294. 19. Gabarin RS, Li M, Zimmel PA, Marshall JC, Li Y, Zhang H. 
5. Gul F, Arslantas MK, Cinel I, Kumar A. Changing Definitions of Sepsis. Intracellular and Extracellular Lipopolysaccharide Signaling in Sepsis: 
Turk J Anaesthesiol Reanim. 2017;45(3):129-38. Avenues for Novel Therapeutic Strategies. J Innate Immun. 
2021;13(6):323-32.
6. Singer M, Deutschman CS, Seymour CW, Shankar-Hari M, Annane D, Bauer M, et al. The Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3). JAMA. 2016;315(8):801-10.
7. Ljungström L, Steinum O, Brink M, Gårdlund B, Martner J, Sjölin J. Diagnostik och diagnoskodning av svår sepsis och septisk chock. ICD10 bör kompletteras med tilläggskoder. Läkartidningen. 2011.
8. Andersson M, Brink M, Cronqvist J, Furebring M, Gille-Johnson P, Ljungström L, et al. Sepsis och septisk chock, tidig identifiering och initial handläggning. Svenska Infektionsläkarföreningen; 2018.
9. Rudd KE, Johnson SC, Agesa KM, Shackelford KA, Tsoi D, Kievlan DR, et al. Global, regional, and national sepsis incidence and mortality, 1990–2017: analysis for the Global Burden of Disease Study. The Lancet. 2020;395(10219):200-11.
10. Fleischmann-Struzek C, Mellhammar L, Rose N, Cassini A, Rudd KE, Schlattmann P, et al. Incidence and mortality of hospital- and ICU-treated sepsis: results from an updated and expanded systematic review and meta-analysis. Intensive Care Medicine. 2020;46(8):1552-62.
11. Seree-aphinan C, Vichitkunakorn P, Navakanitworakul R, Khwannimit B. Distinguishing Sepsis From Infection by Neutrophil Dysfunction: A Promising Role of CXCR2 Surface Level. Frontiers in Immunology. 2020;11.
12. Rhee C, Dantes R, Epstein L, Murphy D, Seymour C, Iwashyna T, et al. Incidence and Trends of Sepsis in US Hospitals Using Clinical vs Claims Data, 2009-2014. JAMA. 2017;318(13):1241-9.
13. Lengquist M, Lundberg OHM, Spångfors M, Annborn M, Levin H, Friberg H, et al. Sepsis is underreported in Swedish intensive care units: A retrospective observational multicentre study. Acta Anaesthesiologica Scandinavica. 2020;64(8):1167-76.
14. Janeway CJ, Travers P, Walport M. The front line of host defense. Immunobiology: The Immune System in Health and Disease. 5th ed. New York: Garland Science; 2001.
20. Wang M, Feng J, Zhou D, Wang J. Bacterial lipopolysaccharide-induced endothelial activation and dysfunction: a new predictive and therapeutic paradigm for sepsis. Eur J Med Res. 2023;28(1):339.
21. Dinges M, Orwin P, Schlievert P. Exotoxins of Staphylococcus aureus. Clin Microbiol Rev. 2000;13(1):16-34.
22. Thomas D, Dauwalder O, Brun V, Badiou C, Ferry T, Etienne J, et al. Staphylococcus aureus superantigens elicit redundant and extensive human Vbeta patterns. Infect Immun. 2009;77(5):2043-50.
23. Atanasova KR. Interactions between porcine respiratory coronavirus and bacterial cell wall toxins in the lungs of pigs: Ghent University; 2010.
24. Chang JC. Sepsis and septic shock: endothelial molecular pathogenesis associated with vascular microthrombotic disease. Thromb J. 2019;17:10.
25. Nedeva C, Menassa J, Puthalakath H. Sepsis: Inflammation Is a Necessary Evil. Frontiers in Cell and Developmental Biology. 2019;7(108).
26. Huang M, Cai S, Su J. The Pathogenesis of Sepsis and Potential Therapeutic Targets. Int J Mol Sci. 2019;20(21).
27. Spapen HD, Jacobs R, Honoré PM. Sepsis-induced multi-organ dysfunction syndrome - a mechanistic approach. Journal of Emergency and Critical Care Medicine. 2017;1(10):27.
28. Sagy M, Al-Qaqaa Y, Kim P. Definitions and pathophysiology of sepsis. Curr Probl Pediatr Adolesc Health Care. 2013;43(10):260-3.
29. Chousterman BG, Swirski FK, Weber GF. Cytokine storm and sepsis disease pathogenesis. Seminars in Immunopathology. 2017;39(5):517-28.
30. Ramachandran G. Gram-positive and gram-negative bacterial toxins in sepsis: a brief review. Virulence. 2014;5(1):213-8.
31. Raetz CR, Whitfield C. Lipopolysaccharide endotoxins. Annu Rev Biochem. 2002;71:635-700.
32. Rello J, Valenzuela-Sanchez F, Ruiz-Rodriguez M, Moyano S. Sepsis: A Review of Advances in Management. Adv Ther. 2017;34(11):2393-411.
33. Pop-Began V, Păunescu V, Grigorean V, Pop-Began D, Popescu C. Molecular Mechanism in the Pathogenesis of Sepsis. J Med Life. 2014(2):4.
34. Yang L, Lin Y, Wang J, Song J, Wei B, Zhang X, et al. Comparison of Clinical Characteristics and Outcomes Between Positive and Negative Blood Culture Septic Patients: A Retrospective Cohort Study. Infect Drug Resist. 2021;14:4191-205.
35. Previsdomini M, Gini M, Cerutti B, Dolina M, Perren A. Predictors of positive blood cultures in critically ill patients: a retrospective evaluation. Croat Med J. 2012;53(1):30-9.
36. Luethy PM, Johnson JK. The Use of Matrix-Assisted Laser Desorption/Ionization Time-of-Flight Mass Spectrometry (MALDI-TOF MS) for the Identification of Pathogens Causing Sepsis. J Appl Lab Med. 2019;3(4):675-85.
37. Peters RPH, van Agtmael MA, Danner SA, Savelkoul PHM, Vandenbroucke-Grauls CMJE. New developments in the diagnosis of bloodstream infections. The Lancet Infectious Diseases. 2004;4(12):751-60.
38. Opota O, Jaton K, Greub G. Microbial diagnosis of bloodstream infection: towards molecular diagnosis directly from blood. Clin Microbiol Infect. 2015;21(4):323-31.
39. Lee T, Pang S, Stegger M, Sahibzada S, Abraham S, Daley D, et al. A three-year whole genome sequencing perspective of Enterococcus faecium sepsis in Australia. PLoS One. 2020;15(2):e0228781.
40. Taxt AM, Avershina E, Frye SA, Naseer U, Ahmad R. Rapid identification of pathogens, antibiotic resistance genes and plasmids in blood cultures by nanopore sequencing. Sci Rep. 2020;10(1):7622.
41. Shaidullina E, Shelenkov A, Yanushevich Y, Mikhaylova Y, Shagin D, Alexandrova I, et al. Antimicrobial Resistance and Genomic Characterization of OXA-48- and CTX-M-15-Co-Producing Hypervirulent Klebsiella pneumoniae ST23 Recovered from Nosocomial Outbreak. Antibiotics (Basel). 2020;9(12).
42. Rumore J, Tschetter L, Kearney A, Kandar R, McCormick R, Walker M, et al. Evaluation of whole-genome sequencing for outbreak detection of Verotoxigenic Escherichia coli O157:H7 from the Canadian perspective. BMC Genomics. 2018;19(1):870.
43. Califf RM. Biomarker definitions and their applications. Exp Biol Med (Maywood). 2018;243(3):213-21.
44. Pierrakos C, Velissaris D, Bisdorff M, Marshall JC, Vincent JL. Biomarkers of sepsis: time for a reappraisal. Crit Care. 2020;24(1):287.
45. Bloos F, Reinhart K. Rapid diagnosis of sepsis. Virulence. 2014;5(1):154-60.
46. Faix JD. Biomarkers of sepsis. Crit Rev Clin Lab Sci. 2013;50(1):23-36.
47. Irani-Shemirani M. Biomarkers Approach in the Diagnosis and Prognosis of Sepsis. International Journal of Public Health Research. 2022;12:1617-24.
48. Lopez-Castejon G, Brough D. Understanding the mechanism of IL-1beta secretion. Cytokine Growth Factor Rev. 2011;22(4):189-95.
49. Bozza FA, Salluh JI, Japiassu AM, Soares M, Assis EF, Gomes RN, et al. Cytokine profiles as markers of disease severity in sepsis: a multiplex analysis. Crit Care. 2007;11(2):R49.
50. Morrow KN, Coopersmith CM, Ford ML. IL-17, IL-27, and IL-33: A Novel Axis Linked to Immunological Dysfunction During Sepsis. Front Immunol. 2019;10:1982.
51. Cao J, Xu F, Lin S, Song Z, Zhang L, Luo P, et al. IL-27 controls sepsis-induced impairment of lung antibacterial host defence. Thorax. 2014;69(10):926-37.
52. Wang Y, Zhao J, Yao Y, Zhao D, Liu S. Interleukin-27 as a Diagnostic Biomarker for Patients with Sepsis: A Meta-Analysis. Biomed Res Int. 2021;2021:5516940.
53. Wang JF, Yu ML, Yu G, Bian JJ, Deng XM, Wan XJ, et al. Serum miR-146a and miR-223 as potential new biomarkers for sepsis. Biochem Biophys Res Commun. 2010;394(1):184-8.
54. Shen X, Zhang J, Huang Y, Tong J, Zhang L, Zhang Z, et al. Accuracy of circulating microRNAs in diagnosis of sepsis: a systematic review and meta-analysis. J Intensive Care. 2020;8(1):84.
55. Tagini F, Greub G. Bacterial genome sequencing in clinical microbiology: a pathogen-oriented review. Eur J Clin Microbiol Infect Dis. 2017;36(11):2007-20.
56. Zheng L, Lin F, Zhu C, Liu G, Wu X, Wu Z, et al. Machine Learning Algorithms Identify Pathogen-Specific Biomarkers of Clinical and Metabolomic Characteristics in Septic Patients with Bacterial Infections. Biomed Res Int. 2020;2020:6950576.
57. Ljungström L. Community onset sepsis in Sweden: a population based study. Gothenburg, Sweden: Sahlgrenska Academy at University of Gothenburg; 2017.
58. Ahn SH, Tsalik EL, Cyr DD, Zhang Y, van Velkinburgh JC, Langley RJ, et al. Gene expression-based classifiers identify Staphylococcus aureus infection in mice and humans. PLoS One. 2013;8(1):e48979.
59. Ljungstrom L, Enroth H, Claesson BE, Ovemyr I, Karlsson J, Froberg B, et al. Clinical evaluation of commercial nucleic acid amplification tests in patients with suspected sepsis. BMC Infect Dis. 2015;15:199.
60. Andrews S. FastQC: a quality control tool for high throughput sequence data. 2010.
61. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114-20.
62. Bankevich A, Nurk S, Antipov D, Gurevich AA, Dvorkin M, Kulikov AS, et al. SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing. J Comput Biol. 2012;19(5):455-77.
63. Gurevich A, Saveliev V, Vyahhi N, Tesler G. QUAST: quality assessment tool for genome assemblies. Bioinformatics. 2013;29(8):1072-5.
64. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2019.
65. Larsen MV, Cosentino S, Lukjancenko O, Saputra D, Rasmussen S, Hasman H, et al. Benchmarking of methods for genomic taxonomy. J Clin Microbiol. 2014;52(5):1529-39.
66. Clausen P, Aarestrup FM, Lund O. Rapid and precise alignment of raw reads against redundant databases with KMA. BMC Bioinformatics. 2018;19(1):307.
67. Hasman H, Saputra D, Sicheritz-Ponten T, Lund O, Svendsen CA, Frimodt-Moller N, et al. Rapid whole-genome sequencing for detection and characterization of microorganisms directly from clinical samples. J Clin Microbiol. 2014;52(1):139-46.
68. Richter M, Rosselló-Móra R, Oliver Glöckner F, Peplies J. JSpeciesWS: a web server for prokaryotic species circumscription based on pairwise genome comparison. Bioinformatics. 2015;32(6):929-31.
69. Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, et al. Identification of acquired antimicrobial resistance genes. J Antimicrob Chemother. 2012;67(11):2640-4.
70. Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, et al. Multilocus sequence typing of total-genome-sequenced bacteria. J Clin Microbiol. 2012;50(4):1355-61.
71. Joensen KG, Scheutz F, Lund O, Hasman H, Kaas RS, Nielsen EM, et al. Real-time whole-genome sequencing for routine typing, surveillance, and outbreak detection of verotoxigenic Escherichia coli. J Clin Microbiol. 2014;52(5):1501-10.
72. Gordon NC, Price JR, Cole K, Everitt R, Morgan M, Finney J, et al. Prediction of Staphylococcus aureus antimicrobial resistance by whole-genome sequencing. J Clin Microbiol. 2014;52(4):1182-91.
73. Fredriksson S, Gullberg M, Jarvius J, Olsson C, Pietras K, Gústafsdóttir SM, et al. Protein detection using proximity-dependent DNA ligation assays. Nature Biotechnology. 2002;20(5):473-7.
74. Wei R, Wang J, Jia E, Chen T, Ni Y, Jia W. A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies. PLoS Computational Biology. 2018;14(1).
75. Ho TK. Random decision forests. 3rd international conference on document analysis and recognition. 1995;1:278-82.
76. Tibshirani R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society Series B (Methodological). 1996;58(1).
77. Guyon I, Weston J, Barnhill S, Vapnik V. Gene selection for cancer classification using support vector machines. Machine Learning. 2002;46:389-422.
78. Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein-protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43(Database issue):D447-52.
79. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498-504.
80. Bader GD, Hogue CW. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003;4.
81. Stelzer G, Rosen N, Plaschkes I, Zimmerman S, Twik M, Fishilevich S, et al. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses. Current protocols in bioinformatics. 2016;54:1.30.1–1..3.
82. Safran M, Rosen N, Twik M, BarShir R, Iny Stein T, Dahary D, et al. The GeneCards Suite. In: Abugessaisa, I., Kasukawa, T. (eds) Practical Guide to Life Science Databases. Singapore: Springer; 2021. p. 27-56.
83. Mi H, Thomas P. PANTHER pathway: an ontology-based pathway database coupled with data analysis tools. Methods Mol Biol. 2009;563:123-40.
84. Milacic M, Beavers D, Conley P, Gong C, Gillespie M, Griss J, et al. The Reactome Pathway Knowledgebase 2024. Nucleic Acids Res. 2024;52(D1):D672-D8.
85. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000;25(1):25-9.
86. Gene Ontology Consortium, Aleksander SA, Balhoff J, Carbon S, Cherry JM, Drabkin HJ, et al. The Gene Ontology knowledgebase in 2023. Genetics. 2023;224(1).
87. Thomas PD, Ebert D, Muruganujan A, Mushayahama T, Albou LP, Mi H. PANTHER: Making genome-scale phylogenetics accessible to all. Protein Sci. 2022;31(1):8-22.
88. Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biology. 2004;5(10):R80.
89. Pankla R, Buddhisa S, Berry M, Blankenship DM, Bancroft GJ, Banchereau J, et al. Genomic transcriptional profiling identifies a
candidate blood biomarker signature for the diagnosis of septicemic melioidosis. Genome Biol. 2009;10(11):R127.
90. Dix A, Hunniger K, Weber M, Guthke R, Kurzai O, Linde J. Biomarker-based classification of bacterial and fungal whole-blood infections in a genome-wide expression study. Front Microbiol. 2015;6:171.
91. Kluyver T, Ragan-Kelley B, Perez F, Granger B, Bussonnier M, Frederic J, et al. Jupyter Notebooks – a publishing format for reproducible computational workflows. Positioning and Power in Academic Publishing: Players, Agents and Agendas. 2016:87-90.
92. Anaconda Documentation [Internet]. Anaconda Inc. 2020. Available from: https://docs.anaconda.com/.
93. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research. 2011;12:2825-30.
94. Le Cessie S, Houwelingen JCV. Ridge estimators in logistic regression. Journal of the Royal Statistical Society: Series C (Applied Statistics). 1992;41(1):191–201.
95. Fournier P, Dubourg G, Raoult D. Clinical detection and characterization of bacterial pathogens in the genomics era. Genome Medicine. 2014;6(11).
96. Endrullat C, Glokler J, Franke P, Frohme M. Standardization and quality management in next-generation sequencing. Appl Transl Genom. 2016;10:2-9.
97. Tang Hallback E, Karami N, Adlerberth I, Cardew S, Ohlen M, Engstrom Jakobsson H, et al. Methicillin-resistant Staphylococcus argenteus misidentified as methicillin-resistant Staphylococcus aureus emerging in western Sweden. J Med Microbiol. 2018;67(7):968-71.
98. Giske CG, Dyrkell F, Arnellos D, Vestberg N, Hermansson Panna S, Froding I, et al. Transmission events and antimicrobial susceptibilities of methicillin-resistant Staphylococcus argenteus in Stockholm. Clin Microbiol Infect. 2019;25(10):1289.e5-e8.
99. Enstrom J, Froding I, Giske CG, Ininbergs K, Bai X, Sandh G, et al. USA300 methicillin-resistant Staphylococcus aureus in Stockholm, Sweden, from 2008 to 2016. PLoS One. 2018;13(11):e0205761.
100. Saxenborn P, Baxter J, Tilevik A, Fagerlind M, Dyrkell F, Pernestig AK, et al. Genotypic Characterization of Clinical Klebsiella spp. Isolates Collected From Patients With Suspected Community-Onset Sepsis, Sweden. Front Microbiol. 2021;12:640408.
101. Tong SYC, Schaumburg F, Ellington MJ, Corander J, Pichon B, Leendertz F, et al. Novel staphylococcal species that form part of a Staphylococcus aureus-related complex: the non-pigmented Staphylococcus argenteus sp. nov. and the non-human primate-associated Staphylococcus schweitzeri sp. nov. Int J Syst Evol Microbiol. 2015;65(Pt 1):15-22.
102. Holden MT, Hsu LY, Kurt K, Weinert LA, Mather AE, Harris SR, et al. A genomic portrait of the emergence, evolution, and global spread of a methicillin-resistant Staphylococcus aureus pandemic. Genome Res. 2013;23(4):653-64.
103. Becker K, Schaumburg F, Kearns A, Larsen AR, Lindsay JA, Skov RL, et al. Implications of identifying the recently defined members of the Staphylococcus aureus complex S. argenteus and S. schweitzeri: a position paper of members of the ESCMID Study Group for Staphylococci and Staphylococcal Diseases (ESGS). Clin Microbiol Infect. 2019;25(9):1064-70.
104. Koser CU, Holden MT, Ellington MJ, Cartwright EJ, Brown NM, Ogilvy-Stuart AL, et al. Rapid whole-genome sequencing for investigation of a neonatal MRSA outbreak. N Engl J Med. 2012;366(24):2267-75.
105. Mason A, Foster D, Bradley P, Golubchik T, Doumith M, Gordon NC, et al. Accuracy of Different Bioinformatics Methods in Detecting Antibiotic Resistance and Virulence Factors from Staphylococcus aureus Whole-Genome Sequences. J Clin Microbiol. 2018;56(9).
106. Kane TL, Carothers KE, Lee SW. Virulence Factor Targeting of the Bacterial Pathogen Staphylococcus aureus for Vaccine and Therapeutics. Curr Drug Targets. 2018;19(2):111-27.
107. Bukowski M, Wladyka B, Dubin G. Exfoliative toxins of Staphylococcus aureus. Toxins (Basel). 2010;2(5):1148-65.
108. Spaulding AR, Salgado-Pabon W, Kohler PL, Horswill AR, Leung DY, Schlievert PM. Staphylococcal and streptococcal superantigen exotoxins. Clin Microbiol Rev. 2013;26(3):422-47.
109. Shallcross LJ, Fragaszy E, Johnson AM, Hayward AC. The role of the Panton-Valentine leucocidin toxin in staphylococcal disease: a systematic review and meta-analysis. Lancet Infect Dis. 2013;13(1):43-54.
110. Enright M, Spratt B. Multilocus sequence typing. Trends Microbiol. 1999;7(12):482-7.
111. Urwin R, Maiden MC. Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol. 2003;11(10):479-87.
112. Struelens M. Molecular epidemiologic typing systems of bacterial pathogens: current issues and perspectives. Mem Inst Oswaldo Cruz. 1998;93(5):581-5.
113. Maiden M, Bygraves J, Feil E, Morelli G, Russell J, Urwin R, et al. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A. 1998;95(6):3140-5.
114. Page AJ, Alikhan NF, Carleton HA, Seemann T, Keane JA, Katz LS. Comparison of classical multi-locus sequence typing software for next-generation sequencing data. Microb Genom. 2017;3(8):e000124.
68 69 
  
Biomarker profiling in sepsis diagnostics Mahnaz Irani Shemirani 
candidate blood biomarker signature for the diagnosis of septicemic 102. Holden MT, Hsu LY, Kurt K, Weinert LA, Mather AE, Harris SR, et al. 
melioidosis. Genome Biol. 2009;10(11):R127. A genomic portrait of the emergence, evolution, and global spread of a 
90. Dix A, Hunniger K, Weber M, Guthke R, Kurzai O, Linde J. Biomarker- methicillin-resistant Staphylococcus aureus pandemic. Genome Res. 
based classification of bacterial and fungal whole-blood infections in a 2013;23(4):653-64. 
genome-wide expression study. Front Microbiol. 2015;6:171. 103. Becker K, Schaumburg F, Kearns A, Larsen AR, Lindsay JA, Skov RL, 
91. Kluyver T, Ragan-Kelley B, Perez F, Granger B, Bussonnier M, et al. Implications of identifying the recently defined members of the 
Frederic J, et al. Jupyter Notebooks – a publishing format for Staphylococcus aureus complex S. argenteus and S. schweitzeri: a 
reproducible computational workflows. Positioning and Power in position paper of members of the ESCMID Study Group for 
Academic Publishing: Players, Agents and Agendas. 2016:87-90. Staphylococci and Staphylococcal Diseases (ESGS). Clin Microbiol 
92. Anaconda Documentation [Internet]. Anaconda Inc. 2020. Available Infect. 2019;25(9):1064-70. 
from: https://docs.anaconda.com/. 104. Koser CU, Holden MT, Ellington MJ, Cartwright EJ, Brown NM, 
93. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Ogilvy-Stuart AL, et al. Rapid whole-genome sequencing for 
et al. Scikit-learn: Machine Learning in Python. Journal of Machine investigation of a neonatal MRSA outbreak. N Engl J Med. 
Learning Research. 2011;12:2825-30. 2012;366(24):2267-75. 
94. Le Cessie S, Houwelingen JCV. Ridge estimators in logistic regression. 105. Mason A, Foster D, Bradley P, Golubchik T, Doumith M, Gordon NC, 
Journal of the Royal Statistical Society: Series C (Applied Statistics). et al. Accuracy of Different Bioinformatics Methods in Detecting 
1992;41(1):191–201. Antibiotic Resistance and Virulence Factors from Staphylococcus 
95. Fournier P, Dubourg G, Raoult D. Clinical detection and aureus Whole-Genome Sequences. J Clin Microbiol. 2018;56(9). 
characterization of bacterial pathogens in the genomics era. Genome Medicine. 2014;6(11).
96. Endrullat C, Glokler J, Franke P, Frohme M. Standardization and quality management in next-generation sequencing. Appl Transl Genom. 2016;10:2-9.
97. Tang Hallback E, Karami N, Adlerberth I, Cardew S, Ohlen M, Engstrom Jakobsson H, et al. Methicillin-resistant Staphylococcus argenteus misidentified as methicillin-resistant Staphylococcus aureus emerging in western Sweden. J Med Microbiol. 2018;67(7):968-71.
98. Giske CG, Dyrkell F, Arnellos D, Vestberg N, Hermansson Panna S, Froding I, et al. Transmission events and antimicrobial susceptibilities of methicillin-resistant Staphylococcus argenteus in Stockholm. Clin Microbiol Infect. 2019;25(10):1289 e5- e8.
99. Enstrom J, Froding I, Giske CG, Ininbergs K, Bai X, Sandh G, et al. USA300 methicillin-resistant Staphylococcus aureus in Stockholm, Sweden, from 2008 to 2016. PLoS One. 2018;13(11):e0205761.
100. Saxenborn P, Baxter J, Tilevik A, Fagerlind M, Dyrkell F, Pernestig AK, et al. Genotypic Characterization of Clinical Klebsiella spp. Isolates Collected From Patients With Suspected Community-Onset Sepsis, Sweden. Front Microbiol. 2021;12:640408.
101. Tong SYC, Schaumburg F, Ellington MJ, Corander J, Pichon B, Leendertz F, et al. Novel staphylococcal species that form part of a Staphylococcus aureus-related complex: the non-pigmented Staphylococcus argenteus sp. nov. and the non-human primate-associated Staphylococcus schweitzeri sp. nov. Int J Syst Evol Microbiol. 2015;65(Pt 1):15-22.
106. Kane TL, Carothers KE, Lee SW. Virulence Factor Targeting of the Bacterial Pathogen Staphylococcus aureus for Vaccine and Therapeutics. Curr Drug Targets. 2018;19(2):111-27.
107. Bukowski M, Wladyka B, Dubin G. Exfoliative toxins of Staphylococcus aureus. Toxins (Basel). 2010;2(5):1148-65.
108. Spaulding AR, Salgado-Pabon W, Kohler PL, Horswill AR, Leung DY, Schlievert PM. Staphylococcal and streptococcal superantigen exotoxins. Clin Microbiol Rev. 2013;26(3):422-47.
109. Shallcross LJ, Fragaszy E, Johnson AM, Hayward AC. The role of the Panton-Valentine leucocidin toxin in staphylococcal disease: a systematic review and meta-analysis. Lancet Infect Dis. 2013;13(1):43-54.
110. Enright M, Spratt B. Multilocus sequence typing. Trends Microbiol. 1999 7(12):482-7.
111. Urwin R, Maiden MC. Multi-locus sequence typing: a tool for global epidemiology. Trends Microbiol. 2003;11(10):479-87.
112. Struelens M. Molecular epidemiologic typing systems of bacterial pathogens: current issues and perspectives. Mem Inst Oswaldo Cruz. 1998;93(5):581-5.
113. Maiden M, Bygraves J, Feil E, Morelli G, Russell J, Urwin R, et al. Multilocus sequence typing: a portable approach to the identification of clones within populations of pathogenic microorganisms. Proc Natl Acad Sci U S A 1998;95(6):3140-5.
114. Page AJ, Alikhan NF, Carleton HA, Seemann T, Keane JA, Katz LS. Comparison of classical multi-locus sequence typing software for next-generation sequencing data. Microb Genom. 2017;3(8):e000124.
115. Deurenberg RH, Stobberingh EE. The evolution of Staphylococcus aureus. Infect Genet Evol. 2008;8(6):747-63.
116. Swedres-Svarm. Sales of antibiotics and occurrence of resistance in Sweden. Solna/Uppsala ISSN1650-6332 Sweden; 2021.
117. Gideskog M, Melhus A. Outbreak of Methicillin-resistant Staphylococcus aureus in a Hospital Center for Children's and Women's Health in a Swedish County. APMIS. 2019;127(4):181-6.
118. Japoni A, Ziyaeyan M, Jmalidoust M, Farshad S, Alborzi A, Rafaatpour N, et al. Antibacterial susceptibility patterns and cross-resistance of methicillin resistant and sensitive staphyloccus aureus isolated from the hospitalized patients in shiraz, iran. Braz J Microbiol. 2010 41(3):567-73.
119. Rehman LU, Afzal Khan A, Afridi P, Ur Rehman S, Wajahat M, Khan F. Prevalence and antibiotic susceptibility of clinical staphylococcus aureus isolates in various specimens collected from a tertiary care hospital, Hayatabad, Peshawar, Pakistan. Pakistan Journal of Health Sciences. 2022;04(3):105-10.
120. Mesbah A, Mashak Z, Abdolmaleki Z. A survey of prevalence and phenotypic and genotypic assessment of antibiotic resistance in Staphylococcus aureus bacteria isolated from ready-to-eat food samples collected from Tehran Province, Iran. Trop Med Health. 2021;49(1):81.
121. Kayili E, Sanlibaba P. Prevalence, characterization and antibiotic resistance of Staphylococcus aureus isolated from traditional cheeses in Turkey. International Journal of Food Properties. 2020;23(1):1441-51.
122. Castleman MJ, Pokhrel S, Triplett KD, Kusewitt DF, Elmore BO, Joyner JA, et al. Innate Sex Bias of Staphylococcus aureus Skin Infection Is Driven by alpha-Hemolysin. J Immunol. 2018;200(2):657-68.
123. Kupfer M, Jatzwauk I, Monecke S, Möbius J, Weusten A. MRSA in a large German University Hospital: Male gender is a significant risk factor for MRSA acquisition. GMS Krankenhhyg Interdiszip. 2010;5(2.
124. Thorlacius-Ussing L, Sandholdt H, Larsen AR, Petersen A, Benfield T. Age-Dependent Increase in Incidence of Staphylococcus aureus Bacteremia, Denmark, 2008-2015. Emerg Infect Dis. 2019;25(5):875-82.
125. Skogberg K, Lyytikainen O, Ollgren J, Nuorti JP, Ruutu P. Population-based burden of bloodstream infections in Finland. Clin Microbiol Infect. 2012;18(6):E170-6.
126. Yang ES, Tan J, Eells S, Rieg G, Tagudar G, Miller LG. Body site colonization in patients with community-associated methicillin-resistant Staphylococcus aureus and other types of S. aureus skin infections. Clin Microbiol Infect. 2010;16(5):425-31.
127. Sakr A, Bregeon F, Mege JL, Rolain JM, Blin O. Staphylococcus aureus Nasal Colonization: An Update on Mechanisms, Epidemiology, Risk Factors, and Subsequent Infections. Front Microbiol. 2018;9:2419.
128. Karlsson H, Larsson P, Wold AE, Rudin A. Pattern of cytokine responses to gram-positive and gram-negative commensal bacteria is profoundly changed when monocytes differentiate into dendritic cells. Infect Immun. 2004;72(5):2671-8.
129. Cross ML, Ganner A, Teilab D, Fray LM. Patterns of cytokine induction by gram-positive and gram-negative probiotic bacteria. FEMS Immunol Med Microbiol. 2004;42(2):173-80.
130. Surbatovic M, Popovic N, Vojvodic D, Milosevic I, Acimovic G, Stojicic M, et al. Cytokine profile in severe Gram-positive and Gram-negative abdominal sepsis. Sci Rep. 2015;5:11355.
131. Skovbjerg S, Martner A, Hynsjo L, Hessle C, Olsen I, Dewhirst FE, et al. Gram-positive and gram-negative bacteria induce different patterns of cytokine production in human mononuclear cells irrespective of taxonomic relatedness. J Interferon Cytokine Res. 2010;30(1):23-32.
132. Arabestani MR, Rastiany S, Kazemi S, Mousavi SM. Conventional, molecular methods and biomarkers molecules in detection of septicemia. Adv Biomed Res. 2015;4:120.
133. Petrera A, von Toerne C, Behler J, Huth C, Thorand B, Hilgendorff A, et al. Multi-platforms approach for plasma proteomics: complementarity of Olink PEA technology to mass spectrometry-based protein profiling. Journal of Proteome Research. 2021;20:751-62.
134. Lenz M, Schulz A, Koeck T, Rapp S, Nagler M, Sauer M, et al. Missing value imputation in proximity extension assay-based targeted proteomics data. PLoS One. 2020;15(12):e0243487.
135. Wang K, Bhandari V, Giuliano JS, Jr., CS OH, Shattuck MD, Kirby M. Angiopoietin-1, angiopoietin-2 and bicarbonate as diagnostic biomarkers in children with severe sepsis. PLoS One. 2014;9(9):e108461.
136. Ratzinger F, Haslacher H, Perkmann T, Pinzan M, Anner P, Makristathis A, et al. Machine learning for fast identification of bacteraemia in SIRS patients treated on standard care wards: a cohort study. Sci Rep. 2018;8(1):12233.
137. Koga T, Sumiyoshi R, Furukawa K, Sato S, Migita K, Shimizu T, et al. Interleukin-18 and fibroblast growth factor 2 in combination is a useful diagnostic biomarker to distinguish adult-onset Still's disease from sepsis. Arthritis Res Ther. 2020;22(1):108.
138. Fan Y, Han Q, Li J, Ye G, Zhang X, Xu T, et al. Revealing potential diagnostic gene biomarkers of septic shock based on machine learning analysis. BMC Infect Dis. 2022;22(1):65.
139. Lien F, Lin HS, Wu YT, Chiueh TS. Bacteremia detection from complete blood count and differential leukocyte count with machine learning: complementary and competitive with C-reactive protein and procalcitonin tests. BMC Infect Dis. 2022;22(1):287.
140. Ming T, Dong M, Song X, Li X, Kong Q, Fang Q, et al. Integrated Analysis of Gene Co-Expression Network and Prediction Model
Indicates Immune-Related Roles of the Identified Biomarkers in Sepsis and Sepsis-Induced Acute Respiratory Distress Syndrome. Front Immunol. 2022;13:897390.
141. Mikacenic C, Price BL, Harju-Baker S, O'Mahony DS, Robinson-Cohen C, Radella F, et al. A Two-Biomarker Model Predicts Mortality in the Critically Ill with Sepsis. Am J Respir Crit Care Med. 2017;196(8):1004-11.
142. Li J, Zhou M, Feng JQ, Hong SM, Yang SY, Zhi LX, et al. Bulk RNA Sequencing With Integrated Single-Cell RNA Sequencing Identifies BCL2A1 as a Potential Diagnostic and Prognostic Biomarker for Sepsis. Front Public Health. 2022;10:937303.
143. She H, Tan L, Zhou Y, Zhu Y, Ma C, Wu Y, et al. The Landscape of Featured Metabolism-Related Genes and Imbalanced Immune Cell Subsets in Sepsis. Front Genet. 2022;13:821275.
144. Yao Y, Zhao J, Hu J, Song H, Wang S, Wang Y. Identification of a Four-Gene Signature for Diagnosing Paediatric Sepsis. Biomed Res Int. 2022;2022:5217885.
145. Fang Q, Wang Q, Zhou Z, Xie A. Consensus analysis via weighted gene co-expression network analysis (WGCNA) reveals genes participating in early phase of acute respiratory distress syndrome (ARDS) induced by sepsis. Bioengineered. 2021;12(1):1161-72.
146. Bandyopadhyay S, Lysak N, Adhikari L, Velez LM, Sautina L, Mohandas R, et al. Discovery and Validation of Urinary Molecular Signature of Early Sepsis. Crit Care Explor. 2020;2(10):e0195.
147. Olink. How is the Limit of Detection (LOD) estimated and how is this handled in the data analysis? 2018 [updated 2018 Dec 18; cited 2022 Nov 08].
148. Zhang J, Friberg IM, Kift-Morgan A, Parekh G, Morgan MP, Liuzzi AR, et al. Machine-learning algorithms define pathogen-specific local immune fingerprints in peritoneal dialysis patients with bacterial infections. Kidney Int. 2017;92(1):179-91.
149. Hair J, J F, Black JW, Babin BJ, Anderson ER. Multivariate Data Analysis. Seventh ed: Edinburgh: Pearson Education Limited; 2010.
150. Byrne BM. Structural equation modeling with AMOS: Basic concepts, applications, and programming: New York: Routledge; 2010.
151. Yu MH, Chen MH, Han F, Li Q, Sun RH, Tu YX. Prognostic value of the biomarkers serum amyloid A and nitric oxide in patients with sepsis. Int Immunopharmacol. 2018;62:287-92.
152. Garay-Baquero DJ, White CH, Walker NF, Tebruegge M, Schiff HF, Ugarte-Gil C, et al. Comprehensive plasma proteomic profiling reveals biomarkers for active tuberculosis. JCI Insight. 2020;5(18).
153. Wozniak JM, Mills RH, Olson J, Caldera JR, Sepich-Poore GD, Carrillo-Terrazas M, et al. Mortality Risk Profiling of Staphylococcus aureus Bacteremia by Multi-omic Serum Analysis Reveals Early Predictive and Pathogenic Signatures. Cell. 2020;182(5):1311-27 e14.
154. Eisen DP, Dean MM, Boermeester MA, Fidler KJ, Gordon AC, Kronborg G, et al. Low serum mannose-binding lectin level increases the risk of death due to pneumococcal infection. Clin Infect Dis. 2008;47(4):510-6.
155. Jacobson S, Larsson P, Aberg AM, Johansson G, Winso O, Soderberg S. Levels of mannose-binding lectin (MBL) associates with sepsis-related in-hospital mortality in women. J Inflamm (Lond). 2020;17:28.
156. Liu L, Ning B. The role of MBL2 gene polymorphism in sepsis incidence. In J Clin Exp Pathol. 2015;8(11):15123-7.
157. Li Y, Lyu C-Z, Cheng S-W, Xian L-N, Lin Z-X, Ao X, et al. The clinical relevance of MBL2 gene polymorphism and sepsis. Asian Pacific Journal of Tropical Medicine. 2018;11(3).
158. Morgan BJ, Bauza-Mayol G, Gardner OFW, Zhang Y, Levato R, Archer CW, et al. Bone Morphogenetic Protein-9 Is a Potent Chondrogenic and Morphogenic Factor for Articular Cartilage Chondroprogenitors. Stem Cells Dev. 2020;29(14):882-94.
159. Mostafa S, Pakvasa M, Coalson E, Zhu A, Alverdy A, Castillo H, et al. The wonders of BMP9: From mesenchymal stem cell differentiation, angiogenesis, neurogenesis, tumorigenesis, and metabolism to regenerative medicine. Genes Dis. 2019;6(3):201-23.
160. Faiotto VB, Franci D, Enz Hubert RM, de Souza GR, Fiusa MML, Hounkpe BW, et al. Circulating levels of the angiogenesis mediators endoglin, HB-EGF, BMP-9 and FGF-2 in patients with severe sepsis and septic shock. J Crit Care. 2017;42:162-7.
161. He H, Garcia EA. Learning from Imbalanced Data. IEEE Transactions on Knowledge and Data Engineering. 2009;21:1263-84.
162. Chen H, Li Y, Li T, Sun H, Tan C, Gao M, et al. Identification of Potential Transcriptional Biomarkers Differently Expressed in Both S. aureus- and E. coli-Induced Sepsis via Integrated Analysis. Biomed Res Int. 2019;2019:2487921.
163. Mitchell A, Rentero C, Endoh Y, Hsu K, Gaus K, Geczy C, et al. LILRA5 is expressed by synovial tissue macrophages in rheumatoid arthritis, selectively induces pro-inflammatory cytokines and IL-10 and is regulated by TNF-alpha, IL-10 and IFN-gamma. Eur J Immunol. 2008;38(12):3459-73.
164. Abdallah F, Coindre S, Gardet M, Meurisse F, Naji A, Suganuma N, et al. Leukocyte Immunoglobulin-Like Receptors in Regulating the Immune Response in Infectious Diseases: A Window of Opportunity to Pathogen Persistence and a Sound Target in Therapeutics. Front Immunol. 2021;12:717998.
165. Lewis Marffy AL, McCarthy AJ. Leukocyte Immunoglobulin-Like Receptors (LILRs) on Human Neutrophils: Modulators of Infection and Immunity. Front Immunol. 2020;11:857.
166. Evrard C, Faway E, De Vuyst E, Svensek O, De Glas V, Bergerat D, et al. Deletion of TNFAIP6 Gene in Human Keratinocytes Demonstrates a
Role for TSG-6 to Retain Hyaluronan Inside Epidermis. JID Innov. 
2021;1(4):100054. 
  