LlamaIndex - Building a RAG pipeline#
- Load the data
- Transform the data
- Index and store the data
1.1 Loaders#
The way LlamaIndex does this is via data connectors, also called Readers. Data connectors ingest data from different data sources and format it into Document objects.
```python
# Read a folder of files
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader("./data").load_data()

# Create a single Document directly
from llama_index.core import Document

doc = Document(text="text")
```
After the data is loaded, you need to process and transform it before putting it into a storage system. These transformations include chunking, extracting metadata, and embedding each chunk. This is necessary to make sure that the data can be retrieved and used optimally by the LLM.
Transformation inputs and outputs are Node objects (a Document is a subclass of Node). Transformations can also be stacked and reordered.
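For example, stacked transformations can be run explicitly with an IngestionPipeline before indexing. This is a minimal sketch, assuming the llama-index-embeddings-openai package is installed and an OpenAI API key is set; the chunk sizes and the choice of extractors are illustrative.

```python
from llama_index.core import SimpleDirectoryReader
from llama_index.core.extractors import TitleExtractor
from llama_index.core.ingestion import IngestionPipeline
from llama_index.core.node_parser import SentenceSplitter
from llama_index.embeddings.openai import OpenAIEmbedding

documents = SimpleDirectoryReader("./data").load_data()

# Stack transformations: chunk, extract title metadata, then embed each chunk
pipeline = IngestionPipeline(
    transformations=[
        SentenceSplitter(chunk_size=512, chunk_overlap=10),  # chunking
        TitleExtractor(),  # metadata extraction (calls an LLM)
        OpenAIEmbedding(),  # embedding
    ]
)
nodes = pipeline.run(documents=documents)
```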
Indexes have a .from_documents() method which accepts an array of Document objects and will parse and chunk them for you:
```python
# .from_documents() accepts the list of Documents and chunks them for you
from llama_index.core import VectorStoreIndex

vector_index = VectorStoreIndex.from_documents(documents)
query_engine = vector_index.as_query_engine()
```
If you want to customize core components, like the text splitter, through this abstraction, you can pass in a custom transformations list or apply them to the global Settings:
```python
from llama_index.core.node_parser import SentenceSplitter

text_splitter = SentenceSplitter(chunk_size=512, chunk_overlap=10)

# global
from llama_index.core import Settings

Settings.text_splitter = text_splitter

# per-index
index = VectorStoreIndex.from_documents(
    documents, transformations=[text_splitter]
)
```
You can also choose to add metadata to your documents and nodes.
```python
# Manually add metadata when constructing a Document
document = Document(
    text="text",
    metadata={"filename": "<doc_file_name>", "category": "<category>"},
)
```
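If you load with SimpleDirectoryReader, you can also set metadata for every loaded document at once through its file_metadata hook. A minimal sketch, where the category value is purely illustrative:

```python
from llama_index.core import SimpleDirectoryReader

# file_metadata is called with each file path and its return value
# becomes that document's metadata (the category here is illustrative)
def make_metadata(file_path: str) -> dict:
    return {"filename": file_path, "category": "tutorial"}

documents = SimpleDirectoryReader("./data", file_metadata=make_metadata).load_data()
print(documents[0].metadata)
```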
1.4 Creating and passing Nodes directly#
If you want to, you can create nodes yourself and pass a list of Nodes directly to an indexer:
```python
# Manually create nodes and pass them to the index
from llama_index.core import VectorStoreIndex
from llama_index.core.schema import TextNode

node1 = TextNode(text="<text_chunk>", id_="<node_id>")
node2 = TextNode(text="<text_chunk>", id_="<node_id>")

index = VectorStoreIndex([node1, node2])
```
It's time to build an Index over these objects so you can start querying them.
2.1 What is an Index#
In LlamaIndex terms, an Index is a data structure composed of Document objects, designed to enable querying by an LLM. Your Index is designed to be complementary to your querying strategy.
Only two common index types are introduced below.
The Vector Store Index takes your Documents and splits them up into Nodes. It then creates vector embeddings of the text of every node, ready to be queried by an LLM. Vector embeddings are central to how LLM applications function.
Vector embeddings, often just called embeddings, are numerical representations of the semantics, or meaning, of a piece of text. Two pieces of text with similar meanings will have mathematically similar embeddings, even if the actual text is quite different.
By default LlamaIndex uses text-embedding-ada-002, which is the default embedding used by OpenAI.
When you want to search your embeddings, your query is itself turned into a vector embedding, and then a mathematical operation is carried out by VectorStoreIndex to rank all the embeddings by how semantically similar they are to your query.
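To make that ranking concrete, here is a minimal sketch of the similarity calculation done by hand with the global embed model. By default this calls OpenAI, so an API key is needed; the example texts and the plain cosine similarity are illustrative rather than VectorStoreIndex's exact internals.

```python
import numpy as np
from llama_index.core import Settings

def cosine(a, b):
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

chunks = ["Llamas live in the Andes mountains.", "Transformers rely on attention."]
chunk_embeddings = [Settings.embed_model.get_text_embedding(c) for c in chunks]

query_embedding = Settings.embed_model.get_query_embedding("Where do llamas live?")

# rank chunks by semantic similarity to the query
ranked = sorted(
    zip(chunks, (cosine(query_embedding, e) for e in chunk_embeddings)),
    key=lambda pair: pair[1],
    reverse=True,
)
print(ranked)
```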
2.2 Using Vector Store Index#
Creating from a list of Documents:
```python
# To use the Vector Store Index, pass it the list of Documents you created during the loading stage
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(documents)

# from_documents also takes an optional show_progress argument;
# set it to True to display a progress bar during index construction
index = VectorStoreIndex.from_documents(documents, show_progress=True)
```
Creating from a list of Nodes:
```python
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex(nodes)
```
3 Storing#
Why store? By default, your indexed data is stored only in memory, so you will want to persist it to avoid re-indexing on every run.
3.1 Persisting to disk#
The simplest way is to call .persist() on the index's storage context:
```python
# This works for any type of index
index.storage_context.persist(persist_dir="<persist_dir>")
```
Composable Graph:
```python
graph.root_index.storage_context.persist(persist_dir="<persist_dir>")
```
Loading the persisted index:
```python
from llama_index.core import StorageContext, load_index_from_storage

# rebuild storage context
storage_context = StorageContext.from_defaults(persist_dir="<persist_dir>")

# load index
index = load_index_from_storage(storage_context)
```
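A common pattern is to build and persist the index on the first run, then load it on subsequent runs instead of re-indexing. A minimal sketch, where the ./storage path is illustrative:

```python
import os

from llama_index.core import (
    SimpleDirectoryReader,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)

PERSIST_DIR = "./storage"  # illustrative location

if not os.path.exists(PERSIST_DIR):
    # first run: build the index from documents and persist it to disk
    documents = SimpleDirectoryReader("./data").load_data()
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=PERSIST_DIR)
else:
    # later runs: reload the persisted index instead of re-indexing
    storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)
    index = load_index_from_storage(storage_context)
```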
3.2 Using Vector Stores#
Chroma#
Steps:
- initialize the Chroma client
- create a Collection to store your data in Chroma
- assign Chroma as the vector_store in a StorageContext
- initialize your VectorStoreIndex using that StorageContext
Save:
```python
import chromadb
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext

# load some documents
documents = SimpleDirectoryReader("./data").load_data()

# initialize client, setting path to save data
db = chromadb.PersistentClient(path="./chroma_db")

# create collection
chroma_collection = db.get_or_create_collection("quickstart")

# assign chroma as the vector_store to the context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# create your index
index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

# create a query engine and query
query_engine = index.as_query_engine()
response = query_engine.query("What is the meaning of life?")
print(response)
```
Load:
```python
import chromadb
from llama_index.core import VectorStoreIndex
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import StorageContext

# initialize client
db = chromadb.PersistentClient(path="./chroma_db")

# get collection
chroma_collection = db.get_or_create_collection("quickstart")

# assign chroma as the vector_store to the context
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# load your index from stored vectors
index = VectorStoreIndex.from_vector_store(
    vector_store, storage_context=storage_context
)

# create a query engine
query_engine = index.as_query_engine()
response = query_engine.query("What is llama2?")
print(response)
```
More examples: Chroma - LlamaIndex
Inserting new documents:
```python
from llama_index.core import VectorStoreIndex

index = VectorStoreIndex([])
for doc in documents:
    index.insert(doc)
```
4 Querying#
4.1 The simplest way#
```python
query_engine = index.as_query_engine()
response = query_engine.query(
    "Write an email to the user given their background information."
)
print(response)
```
4.2 Customizing the stages of querying#
4.2.1 Setting the top_k parameter and adding post-processing#
Here we use a different number for top_k and add a post-processing step that requires the retrieved nodes to reach a minimum similarity score in order to be included:
```python
from llama_index.core import VectorStoreIndex, get_response_synthesizer
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.postprocessor import SimilarityPostprocessor

# build index
index = VectorStoreIndex.from_documents(documents)

# configure retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)

# configure response synthesizer
response_synthesizer = get_response_synthesizer()

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer,
    node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.7)],
)

# query
response = query_engine.query("What did the author do growing up?")
print(response)
```
4.2.2 Configuring retriever#
```python
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=10,
)
```
The other stages of querying can be customized in the same way: configuring node postprocessors and configuring response synthesis. A sketch of both follows.
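This is a minimal sketch reusing the retriever configured above; the keyword filters, similarity cutoff, and tree_summarize response mode are illustrative choices, not defaults:

```python
from llama_index.core import get_response_synthesizer
from llama_index.core.postprocessor import (
    KeywordNodePostprocessor,
    SimilarityPostprocessor,
)
from llama_index.core.query_engine import RetrieverQueryEngine

# node postprocessors filter or transform retrieved nodes before synthesis
node_postprocessors = [
    KeywordNodePostprocessor(
        required_keywords=["Combinator"], exclude_keywords=["Italy"]
    ),
    SimilarityPostprocessor(similarity_cutoff=0.7),
]

# the response synthesizer controls how the final answer is generated from the nodes
response_synthesizer = get_response_synthesizer(response_mode="tree_summarize")

query_engine = RetrieverQueryEngine(
    retriever=retriever,
    node_postprocessors=node_postprocessors,
    response_synthesizer=response_synthesizer,
)
response = query_engine.query("What did the author do growing up?")
print(response)
```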
References#
LlamaIndex
Building an LLM Application - LlamaIndex