AgentList

Unstructured

Active
GitHub HTML Apache-2.0

Description

Unstructured parses and cleans documents and is commonly used in RAG ingestion and preprocessing pipelines.
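
To make "parsing and cleaning" concrete, here is a minimal sketch of the kind of step an ingestion pipeline runs before chunking and embedding. It uses only the Python standard library and does not call Unstructured's own API. The `Element` class, the `clean_text` and `partition_plain_text` helpers, and the title heuristic are all hypothetical names and assumptions made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for a parsed document element."""
    category: str  # "Title" or "NarrativeText" in this sketch
    text: str


def clean_text(text: str) -> str:
    """Collapse runs of whitespace (including newlines) and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def partition_plain_text(raw: str) -> list[Element]:
    """Split raw text on blank lines and label each non-empty block.

    Heuristic (an assumption for this sketch): a short block that does
    not end in sentence punctuation is treated as a title.
    """
    elements = []
    for block in re.split(r"\n\s*\n", raw):
        text = clean_text(block)
        if not text:
            continue  # skip empty blocks left by trailing blank lines
        is_title = len(text.split()) <= 8 and not text.endswith((".", "!", "?", ":"))
        elements.append(Element("Title" if is_title else "NarrativeText", text))
    return elements


raw = "Quarterly   Report\n\nRevenue grew   in the\nthird quarter.\n\n\n"
elements = partition_plain_text(raw)
for el in elements:
    print(f"{el.category}: {el.text}")
```

In a real pipeline the labeled, cleaned elements would then be chunked and passed to an embedding model; the project's documentation describes the actual functions it offers for these steps.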

Tags

rag document-processing ingestion python

Categories

📚 RAG Tools

Project Metrics

Stars 14.2k
Forks 1.2k
Watchers 14.2k
Issues 239
Created September 26, 2022
Last commit March 4, 2026

Deployment

Local

Related Projects

Chroma

26.6k · Rust
Active

Chroma is an open-source AI-native embedding database designed for building LLM applications. It provides simple APIs to store embeddings and perform similarity search, making it ideal for RAG applications.

vector-database rag embedding +1

Haystack

24.5k · MDX
Active

Haystack is an enterprise-grade framework for RAG and search applications, covering document processing, retrieval, generation, and evaluation end to end.

rag retrieval llm +1

LlamaIndex

47.7k · Python
Active

LlamaIndex is a data framework that provides the data connection layer for LLM applications, with strong RAG capabilities across diverse data sources and vector databases.

rag llm indexing +1

GPT Researcher

25.7k · Python
Active

GPT Researcher is an autonomous research agent that can gather, organize, and analyze information to produce detailed research reports.

research agent rag +1
© 2026 AgentList. All rights reserved.
