An ongoing research project investigating the use of AI-enabled RAG systems to build knowledge graphs and conceptual hierarchies. Current functionality includes identifying concepts, outcomes, main topics, and main-topic relationships from text sections/chapters; building and visualizing knowledge graphs and hierarchies; and evaluating the LLM output for concepts and outcomes (if ground-truth values are available).
The knowledge graphs and conceptual hierarchies are prototypes still in development, so please do not expect anything extravagant yet. However, the functions that generate concepts and outcomes from text seem to work well in our testing and observations.
We also offer language model, metric, and retriever classes that can be used standalone. You'll need some text (raw text, files, or links) to get started with these.
The main class, RAGKGGenerator, needs a language model and some text to get started. Optionally, you can also provide a course syllabus, retriever type, and sentence transformer if you do not want the defaults.
If you only wish to build a knowledge graph, we offer two main functions to do this for you:
One of them, text_pipeline(), generates a knowledge graph from some text. It doesn't take any arguments; you only need to instantiate a RAGKGGenerator to use it.
The other, syllabus_pipeline(), uses the textbook and syllabus to generate a knowledge graph. Again, it doesn't take any arguments, but for this to work you must provide a syllabus when instantiating the RAGKGGenerator class.
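The choice between the two pipelines can be wrapped in a small helper. The sketch below is illustrative only: `build_knowledge_graph` and its `has_syllabus` flag are hypothetical names, not part of this package, and the code assumes nothing beyond the two argument-free methods described above.

```python
def build_knowledge_graph(gen, has_syllabus):
    """Run the pipeline that matches how `gen` was instantiated.

    `gen` is assumed to be a RAGKGGenerator (or any object exposing
    text_pipeline() and syllabus_pipeline()). `has_syllabus` is a
    hypothetical flag the caller sets to True only when a syllabus
    was passed to the constructor.
    """
    if has_syllabus:
        # Uses both the text and the syllabus.
        return gen.syllabus_pipeline()
    # Falls back to the text-only pipeline.
    return gen.text_pipeline()
```

Passing the flag explicitly avoids guessing how the class stores the syllabus internally, which the text above does not specify.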
To build a knowledge graph from text:
gen = RAGKGGenerator(
    chapters = <chapters>, # put your text sections/chapters here
    llm = <your llm>, # put your llm here, it MUST inherit from DeepEvalLLM. We offer three prebuilts in src.llms
    texts = <your text here>, # text here. provide either a list or a single item (raw text, files, and links are fine.)
)
kg = gen.text_pipeline() # this generates an html file in ./visualizations
Building a knowledge graph from a syllabus and text is very similar. You will also need a syllabus.
gen = RAGKGGenerator(
    chapters = <chapters>, # put your text sections/chapters here
    llm = <your llm>, # put your llm here, it MUST inherit from DeepEvalLLM. We offer three prebuilts in src.llms
    texts = <your text here>, # text here. provide either a list or a single item (raw text, files, and links are fine.)
    syllabus = <syllabus here>, # text, file, or link
)
kg = gen.syllabus_pipeline() # this generates an html file in ./visualizations
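Since both pipelines write an HTML file to ./visualizations, a small helper can locate and open the newest one. This is a minimal sketch using only the standard library; the function names are hypothetical, and because the output file name is not documented here, the code simply picks the most recently modified .html file.

```python
from pathlib import Path
import webbrowser


def latest_visualization(directory="./visualizations"):
    """Return the most recently modified .html file in `directory`, or None."""
    folder = Path(directory)
    if not folder.is_dir():
        return None
    html_files = list(folder.glob("*.html"))
    if not html_files:
        return None
    # Newest file by modification time.
    return max(html_files, key=lambda p: p.stat().st_mtime)


def open_latest_visualization(directory="./visualizations"):
    """Open the newest visualization in the default browser, if one exists."""
    target = latest_visualization(directory)
    if target is None:
        return False
    return webbrowser.open(target.resolve().as_uri())
```

Calling open_latest_visualization() right after a pipeline run should show the graph that was just generated, assuming no other process writes to the same directory.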
If you would like to use functions independently:
gen = RAGKGGenerator(
    chapters = <chapters>, # put your text sections/chapters here
    llm = <your llm>, # put your llm here, it MUST inherit from DeepEvalLLM. We offer three prebuilts in src.llms
    texts = <your text here>, # text here. provide either a list or a single item (raw text, files, and links are fine.)
    syllabus = <syllabus here>, # text, file, or link
)
gen.identify_concepts(5) # generates 5 concepts per text section/chapter
gen.identify_outcomes(5) # generates 5 outcomes per text section/chapter
gen.summarize() # if the text is short and fits in the llm context window, this can summarize it
gen.objectives_from_syllabus() # gets course objectives from provided syllabus
...
and more!
Using conda:
conda env create -f environment.yml
On Windows and Linux, activate the conda virtual environment using:
conda activate rag_kg_generation
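To confirm that the environment is active before running anything, one option is to check the CONDA_DEFAULT_ENV variable that conda sets on activation. The helper below is a hypothetical sketch, not part of this package, and it assumes the environment was created by name from environment.yml as shown above.

```python
import os

EXPECTED_ENV = "rag_kg_generation"  # name taken from the activate command above


def conda_env_is_active(environ=None, expected=EXPECTED_ENV):
    """Return True if conda reports `expected` as the active environment.

    Relies on CONDA_DEFAULT_ENV, which `conda activate` sets.
    `environ` defaults to the real process environment; passing a dict
    makes the check easy to try out in isolation.
    """
    env = os.environ if environ is None else environ
    return env.get("CONDA_DEFAULT_ENV") == expected


if __name__ == "__main__":
    print("active" if conda_env_is_active() else "not active")
```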