Variational autoencoders (VAEs) serve as essential components in large generative models for extracting latent representations and have gained widespread application in biological domains. Developing VAEs specifically tailored to the unique characteristics of biological data is crucial for advancing future large-scale biological models.
Through systematic monitoring of VAE training across 31 public single-cell datasets spanning oncological and normal conditions, we found that reducing the β value, which corresponds to weaker disentanglement in the VAE, significantly improves unsupervised clustering metrics in single-cell data analysis. Based on this finding, we developed iVAE with an irecon module. When benchmarked against 8 established dimensionality reduction methods across 5 clustering performance metrics, iVAE represented single-cell transcriptomic data better than the compared methods.
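
To make the role of β concrete, the following is a minimal sketch of the standard β-weighted VAE objective, in which β scales the KL term against the reconstruction term. The squared-error reconstruction loss and the function names `gaussian_kl` and `beta_vae_loss` are illustrative assumptions, not the iVAE implementation, and the irecon module is not shown.

```python
import math

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one cell's latent code."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))

def beta_vae_loss(x, x_recon, mu, logvar, beta=1.0):
    """Reconstruction error plus beta-weighted KL term.

    Assumption: squared error stands in for the reconstruction likelihood;
    single-cell models often use count-based likelihoods instead.
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_recon))
    return recon + beta * gaussian_kl(mu, logvar)
```

With a perfect reconstruction, the loss reduces to β times the KL term, so lowering β directly weakens the pull of the latent code toward the standard normal prior.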