The ever-growing volume of research publications calls for efficient methods of structuring that knowledge. This automated solution combines dimensionality reduction and clustering (UMAP, HDBSCAN), embedding quantization, and an LLM pipeline to classify 25,000 arXiv publications under a novel taxonomy.
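To make the embedding quantization step more concrete, here is a minimal sketch in plain Python. It assumes binary (sign-based) quantization, which is only one possible scheme and may not be the one used in this pipeline. The function names `binary_quantize` and `hamming_distance` and the toy 8-dimensional vectors are hypothetical, chosen for illustration.

```python
from typing import List


def binary_quantize(embedding: List[float]) -> bytes:
    """Keep only the sign of each dimension and pack 8 dimensions per byte.

    Assumption: sign-based binary quantization (bit = 1 if value > 0).
    """
    if len(embedding) % 8 != 0:
        raise ValueError("embedding length must be a multiple of 8")
    packed = bytearray()
    for start in range(0, len(embedding), 8):
        byte = 0
        for value in embedding[start:start + 8]:
            # Shift earlier bits left and append the sign bit of this value.
            byte = (byte << 1) | (1 if value > 0 else 0)
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the differing bits between two packed binary embeddings."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Toy 8-dimensional "embeddings" (hypothetical values, not real model output).
doc_a = [0.12, -0.40, 0.33, 0.05, -0.27, 0.91, -0.08, 0.44]
doc_b = [0.10, -0.35, 0.29, -0.02, -0.31, 0.88, -0.11, 0.40]
doc_c = [-0.50, 0.62, -0.17, -0.09, 0.73, -0.66, 0.21, -0.38]

code_a = binary_quantize(doc_a)
code_b = binary_quantize(doc_b)
code_c = binary_quantize(doc_c)

print(hamming_distance(code_a, code_b))  # similar vectors: 1 differing bit
print(hamming_distance(code_a, code_c))  # opposite signs: 8 differing bits
```

Stored as 32-bit floats, each 8-dimensional vector above would take 32 bytes; its binary code takes 1 byte. The trade-off is precision: only the sign of each dimension survives, so distances between codes are a coarse approximation of distances between the original vectors.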