zstanjj/HTML-Pruner-Phi-3.8B
zstanjj/HTML-Pruner-Phi-3.8B is a 3.8-billion-parameter model developed by Jiejun Tan, Zhicheng Dou, Wen Wang, Mang Wang, Weipeng Chen, and Ji-Rong Wen. It is designed for two-step block-tree-based HTML pruning in Retrieval-Augmented Generation (RAG) systems: it shortens the HTML documents supplied as context while retaining their semantic information, which improves RAG system efficiency. The model is part of the HtmlRAG framework, which leverages HTML structure for better knowledge modeling in RAG.
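
To make the idea of block-level pruning concrete, the following is a minimal sketch, not the model's actual procedure. It splits an HTML document into blocks, scores each block against a query, and keeps the best-scoring blocks that fit a word budget, preserving document order. Everything in it is an assumption made for illustration: the names `BLOCK_TAGS`, `BlockCollector`, `overlap_score`, and `prune_html` are hypothetical, the word-overlap scorer is a toy stand-in for the relevance scoring a model such as this one would provide, and the single-pass budgeted selection does not reproduce the two-step scheme described above.

```python
from html.parser import HTMLParser

# Assumption: which tags count as "blocks" is chosen here for illustration only.
BLOCK_TAGS = {"div", "p", "section", "article", "li", "h1", "h2", "h3", "td"}


class BlockCollector(HTMLParser):
    """Collects the direct text of each block element, in document order.

    Assumes well-formed input: every block tag that is opened is also closed.
    """

    def __init__(self):
        super().__init__()
        self.blocks = []  # [tag, text] pairs, ordered by opening tag
        self._open = []   # indices into self.blocks for currently open blocks

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.blocks.append([tag, ""])
            self._open.append(len(self.blocks) - 1)

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text is attributed to the innermost open block only.
        if self._open:
            self.blocks[self._open[-1]][1] += data


def overlap_score(query, text):
    """Toy relevance score: number of distinct lowercase words shared."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def prune_html(html, query, max_words):
    """Keep the highest-scoring blocks whose total word count fits max_words."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()

    # Normalize whitespace and drop blocks that carry no direct text.
    blocks = [
        (index, tag, " ".join(text.split()))
        for index, (tag, text) in enumerate(parser.blocks)
    ]
    blocks = [block for block in blocks if block[2]]

    # Greedy selection by descending score; ties keep document order.
    ranked = sorted(blocks, key=lambda b: overlap_score(query, b[2]), reverse=True)
    kept, used = [], 0
    for index, tag, text in ranked:
        size = len(text.split())
        if used + size <= max_words:
            kept.append((index, tag, text))
            used += size

    kept.sort()  # restore document order
    return "".join(f"<{tag}>{text}</{tag}>" for _, tag, text in kept)
```

Calling `prune_html(page, "HTML pruning for RAG", max_words=6)` on a page with several paragraphs returns only the paragraphs that share words with the query and fit within six words in total. The output is flat: surviving blocks are re-emitted without their original nesting or attributes, which a real block-tree approach would be expected to preserve.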