Abstract

Molecular classification of cancer using multi-omics data is central to precision oncology, enabling identification of distinct subtypes based on gene expression, mutations, and methylation, beyond traditional histology. While machine learning methods like Support Vector Machines and Random Forests have shown success, they struggle with high dimensionality, multi-omics integration, and limited generalizability. Deep learning offers promise but faces challenges such as overfitting on small datasets, lack of interpretability, and poor cross-cohort performance.

We present TabPFN, a Prior-data Fitted Network designed for tabular data, as a new foundation model for cancer classification. Unlike traditional models requiring dataset-specific training, TabPFN uses in-context learning via pretraining on synthetic data. This eliminates hyperparameter tuning, allows fast inference, performs well on small datasets, and includes built-in uncertainty estimation—making it ideal for clinical use.

Applied to multiple bladder cancer gene expression cohorts, including The Cancer Genome Atlas (TCGA) RNA-seq data, TabPFN achieves competitive or superior performance with significantly reduced computation time. Our results show robust classification of tumor subtypes using gene expression alone.

Foundation models like TabPFN represent a paradigm shift in computational oncology, addressing long-standing barriers to clinical translation. We conclude by outlining future directions, including multi-omics integration, automated feature selection, and cancer-specific foundation models.

Speaker Bio

Dr. Seungchan Kim is a Chief Scientist and Executive Professor in the Department of Electrical and Computer Engineering and the Director of the CRI Center for Computational Systems Biology at Prairie View A&M University (PVAMU). Prior to this appointment, he was the Head of the Biocomputing Unit and an Associate Professor in the Integrated Cancer Genomics Division of the Translational Genomics Research Institute (TGen). He was one of the founding faculty members of TGen, which was founded in 2002 by Dr. Trent, then Scientific Director of the National Human Genome Research Institute at the National Institutes of Health, and he led computational systems biology research at the institute. He was also an Assistant Professor in the School of Computing, Informatics, and Decision Systems Engineering (CIDSE) at Arizona State University from 2004 to 2011. Dr. Kim received B.S. and M.S. degrees in Agricultural Engineering from Seoul National University and a Ph.D. in Electrical Engineering from Texas A&M University. He also completed postdoctoral training at the Cancer Genetics Branch of the National Human Genome Research Institute.

Dr. Kim’s research interests include: 1) mathematical modeling of genetic regulatory networks, 2) development of computational methods to analyze a multitude of high-throughput multi-omics data to identify disease biomarkers, and 3) computational models to diagnose patients or predict patient outcomes, for example, disease subtypes or drug response. His studies have strongly influenced the development of computational tools for studying the mechanisms underlying cancer development and for better understanding the molecular basis of cancer biology and biological systems.

Abstract

Mass-shooting events pose a significant challenge to public safety, generating large volumes of unstructured textual data that hinder effective investigations and the formulation of public policy. Despite the urgency, few prior studies have effectively automated the extraction of key information from these events to support legal and investigative efforts. This paper presents the first dataset designed for knowledge acquisition on mass-shooting events through the application of named entity recognition (NER) techniques. It focuses on identifying key entities such as offenders, victims, locations, and criminal instruments that are vital for legal and investigative purposes. The NER process is powered by Large Language Models (LLMs) using few-shot prompting, facilitating the efficient extraction and organization of critical information from diverse sources, including news articles, police reports, and social media. Experimental results on real-world mass-shooting corpora demonstrate that GPT-4o is the most effective model for mass-shooting NER, achieving the highest micro-averaged precision, recall, and F1 scores. Meanwhile, o1-mini delivers competitive performance, making it a resource-efficient alternative for less complex NER tasks. Increasing the shot count improves the performance of all models, but the gains are more substantial for GPT-4o and o1-mini, highlighting their superior adaptability to few-shot learning scenarios.
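
For readers unfamiliar with the micro-averaged metrics mentioned above, the following is a minimal sketch of how micro precision, recall, and F1 can be computed for NER output. It assumes exact matching on (entity type, entity text) pairs and pools counts across all documents before dividing; the entity labels and example names are hypothetical and are not taken from the dataset or from the authors' evaluation code.

```python
from collections import Counter


def micro_prf(gold_docs, pred_docs):
    """Micro-averaged precision, recall, and F1.

    Each document is a list of (entity_type, entity_text) pairs.
    True positives, false positives, and false negatives are summed
    over all documents first, and the ratios are taken once at the end.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_docs, pred_docs):
        gold_counts, pred_counts = Counter(gold), Counter(pred)
        # Multiset intersection: predicted entities that exactly match gold.
        overlap = sum((gold_counts & pred_counts).values())
        tp += overlap
        fp += sum(pred_counts.values()) - overlap
        fn += sum(gold_counts.values()) - overlap
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical annotations for two short documents.
gold = [
    [("OFFENDER", "John Doe"), ("LOCATION", "Springfield"), ("WEAPON", "handgun")],
    [("VICTIM", "Jane Roe"), ("LOCATION", "Main Street")],
]
pred = [
    [("OFFENDER", "John Doe"), ("LOCATION", "Springfield")],
    [("VICTIM", "Jane Roe"), ("LOCATION", "Elm Street"), ("WEAPON", "rifle")],
]

precision, recall, f1 = micro_prf(gold, pred)
print(f"micro P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Because counts are pooled before dividing, frequent entity types weigh more heavily in micro-averaged scores than rare ones; a macro average would instead compute the metrics per type and then average them.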

Speaker Bio

Dr. Xishuang Dong is a member of the CRI Center for Computational Systems Biology and CREDIT, and an Associate Professor in the Department of Electrical and Computer Engineering at Prairie View A&M University (PVAMU). His research interests include: (1) machine-learning-based computational systems biology; (2) biomedical information processing; (3) deep learning for big data analysis; and (4) natural language processing.

Abstract

Clinical care is inherently multimodal, with medical image data collected throughout the patient’s journey. For example, a patient at risk of cancer will undergo an ultrasound-guided biopsy, with MRI, when available, revealing regions to be targeted because they are at higher risk of harboring aggressive disease. This biopsy procedure seeks to collect tissue samples for pathology and will inform treatment strategies for best outcomes. This common scenario provides unique opportunities for Artificial Intelligence (AI) methods to effectively integrate multimodal data and learn imaging signatures in patients with known outcomes, enabling early cancer detection for patients at risk. My research focuses on developing AI methods that bridge the gap between highly informative modalities, e.g., pathology or MRI, and lower-resolution modalities, e.g., ultrasound. These methods rely on multimodal image registration, image feature fusion, or the integration of patient-specific data with population-specific information. While the learning is done with multiple imaging modalities, the inference requires only the low-resolution modality, e.g., ubiquitous conventional ultrasound, with applications in low-resource settings. These methods are applied to detect cancer and the extent of aggressive disease in various cancers, e.g., prostate, kidney, or breast.

Speaker Bio

Dr. Rusu is an Assistant Professor in the Department of Radiology and, by courtesy, the Departments of Urology and Biomedical Data Science at Stanford University, where she leads the Personalized Integrative Medicine Laboratory (PIMed). The PIMed Laboratory has a multi-disciplinary direction and focuses on developing analytic methods for biomedical data integration, with a particular interest in multimodal fusion, e.g., radiology-pathology fusion to facilitate radiology image labeling, or MRI-ultrasound fusion for guiding procedures. These fusion approaches allow the downstream training of advanced multimodal machine learning models for cancer detection and subtype identification at the pixel level. The laboratory's approaches have been applied in oncologic (prostate, breast, kidney) and non-oncologic applications.

Dr. Rusu received a Master of Engineering in Bioinformatics from the National Institute of Applied Sciences in Lyon, France. She continued her training at the University of Texas Health Science Center in Houston, where she received a Master of Science and a PhD in Health Informatics for her work in biomolecular structural data integration of cryo-electron micrographs and X-ray crystallography models.

During her postdoctoral training at Rutgers and Case Western Reserve University, Dr. Rusu developed computational tools for the integration and interpretation of multi-modal medical imaging data and focused on studying prostate and lung cancers. Prior to joining Stanford, Dr. Rusu was a Lead Engineer and Medical Image Analysis Scientist at GE Global Research in Niskayuna, NY, where she was involved in the development of analytic methods to characterize biological samples in microscopy images and pathologic conditions in MRI or CT.

Abstract

Most cancers are epithelial in origin and can invade other tissues, and we seek to understand how germline and somatic mutations may contribute to pathogenesis and perturbed immunity in reproductive cancers. Prostate cancer is the second-leading cause of cancer death among US men. The American Cancer Society estimated that there would be over 190,000 new cases of prostate cancer in the US in 2020. It is estimated that one in nine men will receive a prostate cancer diagnosis in their lifetime. A study examining molecular mechanisms identified 362 differentially expressed genes in signaling pathways regulating tumor aggressiveness. The goal of our current projects is to determine the role of variants of undetermined significance in prostate or cervical cancer cell gene expression, apoptosis, and proliferation. We will use the All of Us Research Hub to identify variants. Although known pathogenic variants of the prominent DNA repair genes BRCA1 and BRCA2 have been linked to disease, the functional effects of certain rare variants have not been elucidated. The objectives are to express clinical variants of BRCA2 in prostate cancer cells and examine gene expression, homologous recombination, apoptosis, and cell proliferation. As a future direction, we aim to determine the responsiveness of cancer cells carrying these variants to current targeted therapies.

Speaker Bio

Dr. Victoria Mgbemena is an Assistant Professor in the Department of Biology, Marvin D. and June Samuel Brailsford College of Arts and Sciences, Prairie View A&M University. She received a Ph.D. from the University of Texas Health Science Center at San Antonio, where she studied host-pathogen interactions. She completed her postdoctoral studies in Hematology/Oncology at UT Southwestern Medical Center, where she studied the role of a DNA repair gene in hematopoiesis.

Dr. Mgbemena’s research interests include the inheritance of rare diseases, reproductive health, applications for developing personalized medical approaches, healthcare gaps, and preventative care. She is interested in the following mechanisms: cell-cell communication and the modulation of cell metastatic potential by DNA repair pathways.

Abstract

As microgrids and distributed energy resources (DERs) become critical components of modern energy infrastructures, their increasing digitalization exposes them to sophisticated cyber threats. This talk explores an innovative approach to securing microgrids through the integration of Zero Trust Architecture (ZTA) and Bayesian optimization, ensuring resilience against advanced cyberattacks while maintaining optimal operational efficiency.

The presentation will examine real-world case studies from urban environments (Houston, TX, and Kigali, Rwanda) and rural regions (Northwestern Cameroon and Texas, USA), demonstrating how ZTA principles can enhance security in diverse settings. Dr. Nsoh demonstrates that leveraging Bayesian optimization to dynamically adjust DER operational parameters improves the detection and mitigation of security anomalies, achieving 95%–98% precision in anomaly detection. Additionally, the talk explores the role of zero-shot learning (ZSL) in identifying zero-day attacks, highlighting its potential to enhance cyber resilience in critical energy systems.

Attendees will gain insights into innovative cybersecurity strategies for energy resilience, the impact of machine learning in anomaly detection, and the broader implications for smart grids and critical infrastructure protection. This seminar is particularly relevant for researchers, engineers, and policymakers interested in the intersection of cybersecurity, artificial intelligence, and critical infrastructure such as energy systems.

Speaker Bio

Dr. Jovita Nsoh is an Assistant Professor of Cybersecurity at the University of Houston’s Cullen College of Engineering and a Lead OT Security Architect at Google. With over 26 years of experience, Dr. Nsoh specializes in cybersecurity, AI, and operational technology (OT) security, focusing on zero-trust architectures, AI-driven anomaly detection, and critical infrastructure protection. His expertise spans cloud security, industrial control systems (ICS), and AI-driven cyber resilience. He holds a Ph.D. in Cybersecurity, an MBA in International Business and Finance, and industry certifications including CISSP, CISM, and CISA. Dr. Nsoh has held leadership roles at Microsoft, JPMorgan Chase, Verizon, ConocoPhillips, and HPE, driving large-scale cybersecurity and digital transformation initiatives. A Senior Member of IEEE, he actively contributes to cybersecurity research, policy development, and workforce training.