[DOCS] more proper names and glossary terms (#1042)
* fix some proper names

* nits

* too many, giving up

* remove _ from mkdocs

* llama indexes

---------

Co-authored-by: Josh Reini <[email protected]>
piotrm0 and joshreini1 authored Apr 2, 2024
1 parent 8a84b49 commit 94088be
Showing 52 changed files with 228 additions and 160 deletions.
6 changes: 3 additions & 3 deletions docs/trulens_eval/all_tools.ipynb
@@ -5,7 +5,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# πŸ““ Langchain Quickstart\n",
"# πŸ““ _LangChain_ Quickstart\n",
"\n",
"In this quickstart you will create a simple LLM Chain and learn how to log it and get feedback on an LLM response.\n",
"\n",
@@ -60,7 +60,7 @@
"tru = Tru()\n",
"tru.reset_database()\n",
"\n",
"# Imports from langchain to build app\n",
"# Imports from LangChain to build app\n",
"import bs4\n",
"from langchain import hub\n",
"from langchain.chat_models import ChatOpenAI\n",
@@ -429,7 +429,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# πŸ““ Llama-Index Quickstart\n",
"# πŸ““ _LlamaIndex_ Quickstart\n",
"\n",
"In this quickstart you will create a simple Llama Index app and learn how to log it and get feedback on an LLM response.\n",
"\n",
5 changes: 1 addition & 4 deletions docs/trulens_eval/api/app/trurails.md
@@ -1,7 +1,4 @@
# Tru Rails (NEMO Guardrails)

!!! Warning
This recorder is experimental.
# Tru Rails for _NeMo Guardrails_

::: trulens_eval.tru_rails.TruRails

8 changes: 4 additions & 4 deletions docs/trulens_eval/api/provider/langchain.md
@@ -1,12 +1,12 @@
# πŸ¦œοΈπŸ”— Langchain Provider
# πŸ¦œοΈπŸ”— _LangChain_ Provider

Below is how you can instantiate a [Langchain LLM](https://python.langchain.com/docs/modules/model_io/llms/) as a provider.
Below is how you can instantiate a [_LangChain_ LLM](https://python.langchain.com/docs/modules/model_io/llms/) as a provider.

All feedback functions listed in the base [LLMProvider
class][trulens_eval.feedback.provider.base.LLMProvider] can be run with the Langchain Provider.
class][trulens_eval.feedback.provider.base.LLMProvider] can be run with the _LangChain_ Provider.

!!! note

Langchain provider cannot be used in `deferred` mode due to inconsistent serialization capabilities of langchain apps.
_LangChain_ provider cannot be used in `deferred` mode due to inconsistent serialization capabilities of _LangChain_ apps.

::: trulens_eval.feedback.provider.langchain.Langchain
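The provider pattern this page documents can be sketched in plain Python. A rough illustration under assumed names (class, prompt, scale); not the trulens_eval or _LangChain_ API:

```python
# Sketch of the provider pattern: a feedback provider wraps any callable
# "LLM" so that feedback functions stay model-agnostic. All names here are
# illustrative, not the actual trulens_eval API.
from typing import Callable

class SketchProvider:
    def __init__(self, llm: Callable[[str], str]):
        self.llm = llm  # any callable mapping a prompt to a completion

    def relevance(self, prompt: str, response: str) -> float:
        # Ask the wrapped LLM to grade relevance on a 0-10 scale, then
        # normalize to [0, 1] as LLM-based feedback functions typically do.
        raw = self.llm(
            "Rate 0-10 how relevant this response is to the prompt.\n"
            f"PROMPT: {prompt}\nRESPONSE: {response}\nScore:"
        )
        return float(raw.strip()) / 10.0

# A stub standing in for a LangChain-wrapped model:
provider = SketchProvider(llm=lambda prompt: "8")
score = provider.relevance("What is 2+2?", "4")  # -> 0.8
```

Wrapping the model behind a single callable is also what makes the `deferred`-mode caveat above bite: the provider is only as serializable as the app it wraps.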
8 changes: 4 additions & 4 deletions docs/trulens_eval/contributing/design.md
@@ -36,7 +36,7 @@ function [jsonify][trulens_eval.utils.json.jsonify] is the root of this process.
Classes inheriting [BaseModel][pydantic.BaseModel] come with serialization
to/from json in the form of [model_dump][pydantic.BaseModel.model_dump] and
[model_validate][pydantic.BaseModel.model_validate]. We do not use the
serialization to json part of this capability as a lot of langchain components
serialization to json part of this capability as a lot of _LangChain_ components
are tripped to fail it with a "will not serialize" message. However, we make
use of pydantic `fields` to enumerate components of an object ourselves, saving
us from having to filter out irrelevant internals that are not declared as
@@ -77,7 +77,7 @@ various classes.

##### pydantic (langchain)

Most if not all langchain components use pydantic which imposes some
Most if not all _LangChain_ components use pydantic which imposes some
restrictions but also provides some utilities. Classes inheriting
[BaseModel][pydantic.BaseModel] do not allow defining new attributes but
existing attributes including those provided by pydantic itself can be
@@ -196,7 +196,7 @@ functions that seem to not involve [Task][asyncio.Task] do use tasks, such as
TODO(piotrm): This might have been fixed. Check.

- Some apps cannot be serialized/jsonized. Sequential app is an example. This is
a limitation of langchain itself.
a limitation of _LangChain_ itself.

- Instrumentation relies on CPython specifics, making heavy use of the
[inspect][] module which is not expected to work with other Python
@@ -235,7 +235,7 @@ stack for specific frames:

#### Alternatives

- [contextvars][] -- langchain uses these to manage contexts such as those used
- [contextvars][] -- _LangChain_ uses these to manage contexts such as those used
for instrumenting/tracking LLM usage. These can be used to manage call stack
information like we do. The drawback is that these are not threadsafe or at
least need instrumenting thread creation. We have to do a similar thing by
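As an editor's aside, the [contextvars][] alternative described in this file can be sketched in plain, standard-library Python. All names below are illustrative assumptions, not the trulens_eval implementation:

```python
# Sketch of call-path tracking with contextvars, in the spirit of the
# alternative described above. Illustrative only, not the trulens_eval
# implementation.
import contextvars

call_stack: contextvars.ContextVar = contextvars.ContextVar(
    "call_stack", default=()
)

def instrumented(name):
    """Record `name` on the context-local call stack for the call's duration."""
    def wrap(fn):
        def inner(*args, **kwargs):
            token = call_stack.set(call_stack.get() + (name,))
            try:
                return fn(*args, **kwargs)
            finally:
                call_stack.reset(token)
        return inner
    return wrap

@instrumented("retrieve")
def retrieve(query):
    # Deep inside the app, the context var reveals the enclosing call path.
    return call_stack.get()

@instrumented("app")
def app(query):
    return retrieve(query)

path = app("hello")  # -> ("app", "retrieve")
# Caveat noted above: a plain threading.Thread does not inherit this
# context automatically, so thread creation itself must be instrumented.
```

The drawback mentioned in the text shows up exactly here: the context propagates across awaits and nested calls for free, but not into threads created without extra instrumentation.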
19 changes: 16 additions & 3 deletions docs/trulens_eval/contributing/standards.md
@@ -5,15 +5,28 @@ Enumerations of standards for code and its documentation to be maintained in

## Proper Names

Styling/formatting of proper names in italics.
In natural language text, style/format proper names using italics if available.
In Markdown, this can be done with a single underscore character on both sides
of the term. In unstyled text, use the capitalization as below. This does not
apply when referring to things like package names, classes, or methods.

- _TruLens_
- _TruLens_, _TruLens-Eval_, _TruLens-Explain_

- _LangChain_

- _LlamaIndex_

- _NeMo Guardrails_, _Guardrails_ for short, _rails_ for shorter.
- _NeMo Guardrails_

- _OpenAI_

- _Bedrock_

- _LiteLLM_

- _Pinecone_

- _HuggingFace_

## Python

2 changes: 1 addition & 1 deletion docs/trulens_eval/evaluation/feedback_providers/index.md
@@ -23,7 +23,7 @@ Providers which use large language models for feedback evaluation:
[AzureOpenAI provider][trulens_eval.feedback.provider.openai.AzureOpenAI]
- [Bedrock provider][trulens_eval.feedback.provider.bedrock.Bedrock]
- [LiteLLM provider][trulens_eval.feedback.provider.litellm.LiteLLM]
- [Langchain provider][trulens_eval.feedback.provider.langchain.Langchain]
- [_LangChain_ provider][trulens_eval.feedback.provider.langchain.Langchain]

Feedback functions in common across these providers are in their abstract class
[LLMProvider][trulens_eval.feedback.provider.base.LLMProvider].
@@ -108,11 +108,11 @@ The top level record also contains these helper accessors

- `RecordInput = Record.main_input` -- points to the main input part of a
Record. This is the first argument to the root method of an app (for
langchain Chains this is the `__call__` method).
_LangChain_ Chains this is the `__call__` method).

- `RecordOutput = Record.main_output` -- points to the main output part of a
Record. This is the output of the root method of an app (i.e. `__call__`
for langchain Chains).
for _LangChain_ Chains).

- `RecordCalls = Record.app` -- points to the root of the app-structured
mirror of calls in a record. See **App-organized Calls** Section above.
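The helper accessors above can be pictured with a tiny stand-in for a record. A rough sketch; field names are illustrative assumptions, not the trulens_eval schema:

```python
# Tiny stand-in for a record to picture the accessors described above.
# Field names are illustrative, not the trulens_eval schema.
from dataclasses import dataclass, field

@dataclass
class SketchRecord:
    # For a LangChain Chain, main_input/main_output correspond to the
    # argument and return value of the root __call__ method.
    main_input: str
    main_output: str
    app: dict = field(default_factory=dict)  # app-structured mirror of calls

rec = SketchRecord(
    main_input="What is TruLens?",
    main_output="An LLM evaluation library.",
    app={"retriever": {"get_relevant_documents": ["doc1", "doc2"]}},
)

# The shorthands then just name parts of this structure:
RecordInput, RecordOutput, RecordCalls = rec.main_input, rec.main_output, rec.app
```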
@@ -33,7 +33,7 @@ Several utility methods starting with `.on` provide shorthands:

Some wrappers include additional shorthands:

### Llama-Index specific selectors
### LlamaIndex specific selectors

- `TruLlama.select_source_nodes()` -- outputs the selector of the source
documents part of the engine output.
@@ -55,7 +55,7 @@ Some wrappers include additional shorthands:
context = TruLlama.select_context(query_engine)
```

### LangChain specific selectors
### _LangChain_ specific selectors

- `TruChain.select_context()` -- outputs the selector of the context part of the
engine output.
@@ -67,10 +67,10 @@ Some wrappers include additional shorthands:
context = TruChain.select_context(retriever_chain)
```

### Llama-Index and Langchain specific selectors
### _LlamaIndex_ and _LangChain_ specific selectors

- `App.select_context()` -- outputs the selector of the context part of the
engine output. Can be used for both Llama-Index and Langchain apps.
engine output. Can be used for both _LlamaIndex_ and _LangChain_ apps.

Usage:

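What these selector shorthands produce can be sketched in plain Python: a saved path into the nested JSON of a record, resolved later against each trace. The path below is made up for illustration; the real mechanism is trulens_eval's Lens/Select utilities:

```python
# Sketch of a selector: a path into a record's nested JSON, named before
# any record exists and resolved against each trace afterwards.
# Illustrative only; not the trulens_eval Lens/Select implementation.
def select(*path):
    def resolve(record):
        obj = record
        for key in path:
            obj = obj[key]
        return obj
    return resolve

# Akin to select_context(): point at where retrieved contexts will appear.
context_selector = select("app", "retriever", "invoke", "rets")

record = {
    "main_input": "question",
    "main_output": "answer",
    "app": {"retriever": {"invoke": {"rets": ["doc A", "doc B"]}}},
}
contexts = context_selector(record)  # -> ["doc A", "doc B"]
```

This is why the same `App.select_context()` call can serve both _LlamaIndex_ and _LangChain_ apps: only the stored path differs, not the resolution step.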
75 changes: 67 additions & 8 deletions docs/trulens_eval/getting_started/core_concepts/index.md
@@ -10,8 +10,11 @@

General and πŸ¦‘_TruLens-Eval_-specific concepts.

- `Agent`. A `Component` of an `Application` that performs some related set of
tasks potentially interfacing with some external services or APIs.
- `Agent`. A `Component` of an `Application`, or the entirety of an application,
that provides a natural language interface to some set of capabilities,
typically incorporating `Tools` to invoke or query local or remote services,
while maintaining its state via `Memory`. The user of an agent may be a human, a
tool, or another agent. See also `Multi Agent System`.

- `Application` or `App`. An "application" that is tracked by πŸ¦‘_TruLens-Eval_.
Abstract definition of this tracking corresponds to
@@ -26,10 +26,24 @@ General and πŸ¦‘_TruLens-Eval_-specific concepts.

- `Chain`. A _LangChain_ `App`.

- `Chain of Thought`. The use of an `Agent` to deconstruct its tasks and to
structure, analyze, and refine its `Completions`.

- `Completion`, `Generation`. The process or result of LLM responding to some
`Prompt`.

- `Component`. Part of an `Application`.
- `Component`. Part of an `Application` giving it some capability. Typical
components include:

- `Retriever`

- `Memory`

- `Tool`

- `Prompt Template`

- `LLM`

- `Embedding`. A real vector representation of some piece of text. Can be used
to find related pieces of text in a `Retrieval`.
@@ -48,11 +48,35 @@ General and πŸ¦‘_TruLens-Eval_-specific concepts.
- `Human Feedback`. A feedback that is provided by a human, e.g. a thumbs
up/down in response to a `Completion`.

- `Instruction Prompt`, `System Prompt`. A part of a `Prompt` given to an `LLM`
to complete that contains instructions describing the task that the
`Completion` should solve. Sometimes such prompts include examples of correct
or desirable completions (see `Shots`). A prompt that does not include examples
is said to be `Zero Shot`.

- `LLM`, `Large Language Model`. The `Component` of an `Application` that
performs `Completion`.

- `Memory`. The state maintained by an `Application` or an `Agent` indicating
anything relevant to continuing, refining, or guiding it towards its
goals. `Memory` is provided as `Context` in `Prompts` and is updated when new
relevant context is processed, be it a user prompt or the results of the
invocation of some `Tool`. As `Memory` is included in `Prompts`, it can be a
natural language description of the state of the app/agent. To limit the size
of memory, `Summarization` is often used.

- `Multi-Agent System`. The use of multiple `Agents` incentivized to interact
with each other to implement some capability. While the term predates `LLMs`,
the convenience of the common natural language interface makes the approach
much easier to implement.

- `Prompt`. The text that an `LLM` completes during `Completion`. In chat
applications, the user's message.
applications. See also `Instruction Prompt`, `Prompt Template`.

- `Prompt Template`. A piece of text with placeholders to be filled in in order
to build a `Prompt` for a given task. A `Prompt Template` will typically
include the `Instruction Prompt` with placeholders for things like `Context`,
`Memory`, or `Application` configuration parameters.

- `Provider`. A system that _provides_ the ability to execute models, either
`LLM`s or classification models. In πŸ¦‘_TruLens-Eval_, `Feedback Functions`
@@ -73,18 +73,36 @@ General and πŸ¦‘_TruLens-Eval_-specific concepts.
!!! note
This will be renamed to `Trace` in the future.

- `Retrieval`. The process or result of looking up pieces of context relevant to
some query. Typically this is done using `Embedding` representations of
queries and contexts.
- `Retrieval`, `Retriever`. The process or result (or the `Component` that
performs this) of looking up pieces of text relevant to a `Prompt` to provide
as `Context` to an `LLM`. Typically this is done using `Embedding`
representations.

- `Selector` (πŸ¦‘_TruLens-Eval_-specific concept). A specification of the source
of data from a `Trace` to use as inputs to a `Feedback Function`. This
corresponds to [Lens][trulens_eval.utils.serial.Lens] and utilities
[Select][trulens_eval.schema.Select].

- `Shot`, `Zero Shot`, `Few Shot`, `<Quantity>-Shot`. The use of zero or more
examples in an `Instruction Prompt` to help an `LLM` generate desirable
`Completions`. `Zero Shot` describes prompts that do not have any examples and
only offer a natural language description of the task, while `<Quantity>-Shot`
indicates that some `<Quantity>` of examples are provided.

- `Span`. Some unit of work logged as part of a record. Corresponds to current
πŸ¦‘[RecordAppCallMethod][trulens_eval.schema.RecordAppCall].

- `Tool`. See `Agent`.
- `Summarization`. The task of condensing some natural language text into a
smaller bit of natural language text that preserves the most important parts
of the text. This can be targeted towards humans or otherwise. It can also be
used to maintain concise `Memory` in an `LLM` `Application` or `Agent`.
Summarization can be performed by an `LLM` using a specific `Instruction Prompt`.

- `Tool`. A piece of functionality that can be invoked by an `Application` or
`Agent`. This commonly includes interfaces to services such as search (generic
search via Google, or more specific ones like IMDb for movies). Tools may also
perform actions such as submitting comments to GitHub issues. A `Tool` may
also encapsulate an interface to an `Agent` for use as a component in a larger
`Application`.

- `Trace`. See `Record`.
2 changes: 1 addition & 1 deletion docs/trulens_eval/tracking/instrumentation/index.ipynb
@@ -18,7 +18,7 @@
" LLM app.\n",
"* TruChain instruments LangChain apps. [Read\n",
" more](langchain).\n",
"* TruLlama instruments Llama-Index apps. [Read\n",
"* TruLlama instruments LlamaIndex apps. [Read\n",
" more](llama_index).\n",
"* TruRails instruments NVIDIA Nemo Guardrails apps. [Read more](nemo).\n",
"\n",
14 changes: 7 additions & 7 deletions docs/trulens_eval/tracking/instrumentation/langchain.ipynb
@@ -4,12 +4,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# πŸ““ πŸ¦œοΈπŸ”— LangChain Integration\n",
"# πŸ““ πŸ¦œοΈπŸ”— _LangChain_ Integration\n",
"\n",
"TruLens provides TruChain, a deep integration with LangChain to allow you to\n",
"inspect and evaluate the internals of your application built using LangChain.\n",
"This is done through the instrumentation of key LangChain classes. To see a list\n",
"of classes instrumented, see *Appendix: Instrumented LangChain Classes and\n",
"TruLens provides TruChain, a deep integration with _LangChain_ to allow you to\n",
"inspect and evaluate the internals of your application built using _LangChain_.\n",
"This is done through the instrumentation of key _LangChain_ classes. To see a list\n",
"of classes instrumented, see *Appendix: Instrumented _LangChain_ Classes and\n",
"Methods*.\n",
"\n",
"In addition to the default instrumentation, TruChain exposes the\n",
@@ -41,7 +41,7 @@
"from langchain.prompts.chat import HumanMessagePromptTemplate, ChatPromptTemplate\n",
"from trulens_eval import TruChain\n",
"\n",
"# typical langchain rag setup\n",
"# typical LangChain rag setup\n",
"full_prompt = HumanMessagePromptTemplate(\n",
" prompt=PromptTemplate(\n",
" template=\n",
@@ -157,7 +157,7 @@
"source": [
"## Async Support\n",
"\n",
"TruChain also provides async support for Langchain through the `acall` method. This allows you to track and evaluate async and streaming LangChain applications.\n",
"TruChain also provides async support for _LangChain_ through the `acall` method. This allows you to track and evaluate async and streaming _LangChain_ applications.\n",
"\n",
"As an example, below is an LLM chain set up with an async callback."
]
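The async tracking this notebook describes can be sketched with the standard library alone. A rough illustration of wrapping an async method to record its inputs and outputs; names and structure are assumptions, not the TruChain implementation:

```python
# Sketch of async instrumentation in the spirit of the acall support
# described above: wrap an async method so inputs and outputs are recorded.
# Illustrative only; not the TruChain implementation.
import asyncio

records = []

def instrument_async(fn):
    async def wrapper(*args, **kwargs):
        result = await fn(*args, **kwargs)
        records.append({"args": args, "result": result})
        return result
    return wrapper

@instrument_async
async def acall(prompt: str) -> str:
    await asyncio.sleep(0)  # stand-in for an awaited LLM call
    return prompt.upper()

out = asyncio.run(acall("hello"))  # -> "HELLO"
```

Because the wrapper awaits the inner coroutine rather than calling it, the same pattern records streaming and concurrent invocations without blocking the event loop.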
