Unstructured-IO · Paul-Cornell · Apr 15, 2025 · Apr 16, 2025 · Apr 16, 2025 · Apr 18, 2025
diff --git a/open-source/ingestion/supported-file-types.mdx b/open-source/ingestion/supported-file-types.mdx
@@ -4,6 +4,6 @@ title: Supported file types
 
 The Unstructured Ingest CLI and Unstructured Ingest Python library support processing of the following file types:
 
-import SupportedFileTypesPlatform from '/snippets/general-shared-text/supported-file-types-platform.mdx';
+import SupportedFileTypesPlatform from '/snippets/general-shared-text/supported-file-types-open-source.mdx';
 
 <SupportedFileTypesPlatform />
diff --git a/snippets/general-shared-text/platform-partitioning-strategies.mdx b/snippets/general-shared-text/platform-partitioning-strategies.mdx
@@ -7,13 +7,14 @@ strategies other than **Auto** for sets of documents of different types could pr
 including reduction in transformation quality.
 
 - **VLM**: For the highest-quality transformation of these file types: `.bmp`, `.gif`, `.heic`, `.jpeg`, `.jpg`, `.pdf`, `.png`, `.tiff`, and `.webp`.
-- **High Res**: For all other [supported file types](/ui/supported-file-types), and for the generation of bounding box coordinates.
+- **High Res**: For all other [supported file types](/ui/supported-file-types) except video and audio files, and for the generation of bounding box coordinates.
 - **Fast**: For text-only documents.
+- **Multimedia**: For video and audio files.
 
-The **Auto** partitioning strategy routes each file as a complete unit to the appropriate partitioning strategy (**VLM**, **High Res**, or **Fast**) 
+The **Auto** partitioning strategy routes each file as a complete unit to the appropriate partitioning strategy (**VLM**, **High Res**, **Fast**, or **Multimedia**) 
 based on the preceding file types. Additionally, for `.pdf` files, the **Auto** partitioning strategy routes these files' pages 
 on a page-by-page basis, as follows:
 
 - A page is routed to **Fast** when it contains only embedded text and no images or tables are detected.
 - All other kinds of pages are routed to **VLM** or **High Res**, depending on the complexity of a page's 
-  content. Unstructured constantly optimizes its proprietary algorithm for routing to **VLM** or **High Res** in these cases.
+  content. Unstructured constantly optimizes its proprietary algorithm for routing to **VLM** or **High Res** in these cases.
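
The file-type guidance in this snippet can be sketched as a small lookup. This is a minimal illustration only, not Unstructured's actual **Auto** routing, which is proprietary and also works page by page for `.pdf` files. The function name `suggest_strategy` and the `text_only` hint are hypothetical; the extension sets are copied from the documentation text in this change.

```python
from pathlib import Path

# Extension sets copied from the documentation in this change.
VLM_EXTENSIONS = {".bmp", ".gif", ".heic", ".jpeg", ".jpg", ".pdf", ".png", ".tiff", ".webp"}
AUDIO_EXTENSIONS = {".aac", ".flac", ".m4a", ".mp3", ".mp4", ".ogg", ".opus", ".pcm", ".wav", ".webm"}
VIDEO_EXTENSIONS = {".3gp", ".flv", ".mov", ".mp4", ".mpeg", ".mpg", ".webm", ".wmv"}

# `.mp4` and `.webm` appear in both lists; either way they map to Multimedia.
MULTIMEDIA_EXTENSIONS = AUDIO_EXTENSIONS | VIDEO_EXTENSIONS


def suggest_strategy(filename: str, text_only: bool = False) -> str:
    """Suggest a partitioning strategy for one file (hypothetical helper).

    `text_only` is a caller-supplied hint (an assumption of this sketch)
    that the document contains only text, with no images or tables.
    """
    extension = Path(filename).suffix.lower()
    if extension in MULTIMEDIA_EXTENSIONS:
        return "Multimedia"
    if text_only:
        return "Fast"
    if extension in VLM_EXTENSIONS:
        return "VLM"
    # All other supported file types.
    return "High Res"
```

For example, `suggest_strategy("clip.MP4")` returns `"Multimedia"`, and `suggest_strategy("report.docx")` returns `"High Res"`. The sketch does not check whether an extension is supported at all.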
diff --git a/snippets/general-shared-text/supported-file-types-platform.mdx b/snippets/general-shared-text/supported-file-types-platform.mdx
@@ -2,6 +2,8 @@ By file extension:
 
 | File extension |
 | --- |
+| `.3gp` |
+| `.aac` |
 | `.abw` |
 | `.bmp` |
 | `.csv` |
@@ -16,21 +18,32 @@ By file extension:
 | `.epub` |
 | `.et` |
 | `.eth` |
+| `.flac` |
+| `.flv` |
 | `.fods` |
 | `.heic` |
-| `.htm` | 
+| `.htm` |
 | `.html` |
 | `.hwp` |
 | `.jpeg` |
 | `.jpg` |
+| `.m4a` |
 | `.md` |
 | `.mcw` |
+| `.mov` |
+| `.mp3` |
+| `.mp4` |
+| `.mpeg` |
+| `.mpg` |
 | `.msg` |
 | `.mw` |
 | `.odt` |
+| `.ogg` |
+| `.opus` |
 | `.org` |
 | `.p7s` |
 | `.pbd` |
+| `.pcm` |
 | `.pdf` |
 | `.png` |
 | `.pot` |
@@ -45,6 +58,9 @@ By file extension:
 | `.tiff` |
 | `.txt` |
 | `.tsv` |
+| `.wav` |
+| `.webm` |
+| `.wmv` |
 | `.xls` |
 | `.xlsx` |
 | `.xml` |
@@ -54,6 +70,7 @@ By file type:
 
 | Category | File types |
 | --- | --- |
+| Audio | `.aac`, `.flac`, `.m4a`, `.mp3`, `.mp4`, `.ogg`, `.opus`, `.pcm`, `.wav`, `.webm` |
 | Apple | `.cwk`, `.mcw`
 | CSV | `.csv` |
 | Data Interchange | `.dif`* |
@@ -74,6 +91,7 @@ By file type:
 | Spreadsheet | `.et`, `.fods`, `.mw`, `.xls`, `.xlsx` |
 | StarOffice | `.sxg` |
 | TSV | `.tsv` |
+| Video | `.3gp`, `.flv`, `.mov`, `.mp4`, `.mpeg`, `.mpg`, `.webm`, `.wmv` |
 | Word processing | `.abw`, `.doc`, `.docx`, `.dot`, `.dotm`, `.hwp`, `.zabw` |
 | XML | `.xml` |
 

diff --git a/ui/document-elements.mdx b/ui/document-elements.mdx
@@ -52,23 +52,29 @@ of the file and not care about its headers and footers. You can easily filter ou
 Here are some examples of the element types your file might contain:
 
 | Element type        | Description                                                                                                                                          |
-|---------------------|------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `Address`           | A text element for capturing physical addresses.                                                                                                     |
-| `CodeSnippet`       | A text element for capturing code snippets.                                                                                                          |
-| `EmailAddress`      | A text element for capturing email addresses.                                                                                                        |
-| `FigureCaption`     | An element for capturing text associated with figure captions.                                                                                       |
-| `Footer`            | An element for capturing document footers.                                                                                                           |
-| `FormKeysValues`    | An element for capturing key-value pairs in a form.                                                                                                  | 
-| `Formula`           | An element containing formulas in a file.                                                                                                            |
-| `Header`            | An element for capturing document headers.                                                                                                           |
-| `Image`             | A text element for capturing image metadata.                                                                                                         |
-| `ListItem`          | `ListItem` is a `NarrativeText` element that is part of a list.                                                                                      |
-| `NarrativeText`     | `NarrativeText` is an element consisting of multiple, well-formulated sentences. This excludes elements such titles, headers, footers, and captions. |
-| `PageBreak`         | An element for capturing page breaks.                                                                                                                |
-| `PageNumber`        | An element for capturing page numbers.                                                                                                               |
-| `Table`             | An element for capturing tables.                                                                                                                     |
-| `Title`             | A text element for capturing titles.                                                                                                                 |
-| `UncategorizedText` | Base element for capturing free text from within files. Applies to extracted text not associated with bounding boxes if the input is a PDF file.     |
+|--------------------- |------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `Address`            | A text element for capturing physical addresses.                                                                                                     |
+| `CodeSnippet`        | A text element for capturing code snippets.                                                                                                          |
+| `EmailAddress`       | A text element for capturing email addresses.                                                                                                        |
+| `FigureCaption`      | An element for capturing text associated with figure captions.                                                                                       |
+| `Footer`             | An element for capturing document footers.                                                                                                           |
+| `FormKeysValues`     | An element for capturing key-value pairs in a form.                                                                                                  | 
+| `Formula`            | An element containing formulas in a file.                                                                                                            |
+| `Header`             | An element for capturing document headers.                                                                                                           |
+| `Image`              | A text element for capturing image metadata.                                                                                                         |
+| `ListItem`           | `ListItem` is a `NarrativeText` element that is part of a list.                                                                                      |
+| `NarrativeText`      | `NarrativeText` is an element consisting of multiple, well-formulated sentences. This excludes elements such as titles, headers, footers, and captions. |
+| `PageBreak`          | An element for capturing page breaks.                                                                                                                |
+| `PageNumber`         | An element for capturing page numbers.                                                                                                               |
+| `SceneDescription`   | An element for capturing scene descriptions, for example a description of a scene in a video.                                                        |
+| `Table`              | An element for capturing tables.                                                                                                                     |
+| `Title`              | A text element for capturing titles.                                                                                                                 |
+| `TranscriptFragment` | An element for capturing transcription of speech, for example a speaker's words in an audio clip or video.                                           |
+| `UncategorizedText`  | Base element for capturing free text from within files. Applies to extracted text not associated with bounding boxes if the input is a PDF file.     |
+
+<Note>
+    `SceneDescription` and `TranscriptFragment` are specific to video and audio file processing, which is available only for [self-hosted](/self-hosted/overview) deployments of Unstructured.
+</Note>
 
 If you apply chunking, you will also see the `CompositeElement` type. 
 `CompositeElement` is a chunk formed from text (non-`Table`) elements. 
@@ -187,6 +193,19 @@ file.
 Headers and footers in Word files include a `header_footer_type` indicating which page a header or footer applies to.
 Valid values are `"primary"`, `"even_only"`, and `"first_page"`.
 
+#### Video files
+
+Elements for video files include a `start_time` and an `end_time`, which mark the start and end of the video clip, 
+within the parent video file, to which this element belongs. Also included are `model_version`, which identifies the model used to 
+generate the element, and `average_log_probability`, which is the model's average confidence in its output across the document. Values closer to 
+zero indicate higher confidence.
+
+#### Audio files
+
+Elements for audio files include a `start_time`, an `end_time`, and a `speaker`. The times mark the start and end of an audio clip 
+spoken by a specific speaker, within the parent audio file to which this element belongs. 
+If the speaker cannot be determined, `speaker` is set to `0` or `unknown`.
+
 ### Table-specific metadata
 
 For `Table` elements, the raw text of the table will be stored in the `text` attribute for the element, and HTML representation
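
As a rough illustration of how the video and audio metadata described above might be consumed, the following sketch groups transcript text by speaker and selects elements that overlap a time window. The sample elements are made up for illustration. Placing `start_time`, `end_time`, and `speaker` under a `metadata` key, and treating times as seconds, are assumptions of this sketch and are not confirmed by the text above. The helper names are hypothetical.

```python
from collections import defaultdict

# Made-up sample elements for illustration only. The "metadata" nesting and
# the use of seconds for times are assumptions of this sketch.
sample_elements = [
    {"type": "TranscriptFragment", "text": "Welcome, everyone.",
     "metadata": {"start_time": 0.0, "end_time": 2.5, "speaker": "1"}},
    {"type": "TranscriptFragment", "text": "Thanks for having me.",
     "metadata": {"start_time": 2.5, "end_time": 4.0, "speaker": "2"}},
    {"type": "SceneDescription", "text": "Two people sit at a desk.",
     "metadata": {"start_time": 0.0, "end_time": 4.0}},
    {"type": "TranscriptFragment", "text": "Let's begin.",
     "metadata": {"start_time": 4.0, "end_time": 5.0, "speaker": "1"}},
]


def transcript_by_speaker(elements):
    """Group TranscriptFragment text by speaker, ordered by start time."""
    fragments = [e for e in elements if e.get("type") == "TranscriptFragment"]
    fragments.sort(key=lambda e: e["metadata"]["start_time"])
    grouped = defaultdict(list)
    for fragment in fragments:
        # Fall back to "unknown" when no speaker field is present.
        speaker = str(fragment["metadata"].get("speaker", "unknown"))
        grouped[speaker].append(fragment["text"])
    return dict(grouped)


def elements_in_window(elements, start, end):
    """Return elements whose clip overlaps the half-open window [start, end)."""
    selected = []
    for element in elements:
        metadata = element.get("metadata", {})
        if "start_time" not in metadata or "end_time" not in metadata:
            continue  # Skip elements without timing metadata.
        if metadata["start_time"] < end and metadata["end_time"] > start:
            selected.append(element)
    return selected
```

Calling `transcript_by_speaker(sample_elements)` on the sample data groups two fragments under speaker `"1"` and one under speaker `"2"`, and ignores the `SceneDescription` element.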

diff --git a/ui/workflows.mdx b/ui/workflows.mdx
@@ -69,6 +69,7 @@ By default, this workflow partitions, chunks, and generates embeddings as follow
   - If the page or document has no images and likely does not have tables, **Fast** partitioning is used, and the page or document is billed at the **Fast** rate for processing.
   - If the page or document has only a few tables or images with standard layouts and languages, **High Res** partitioning is used, and the page or document is billed at the **High Res** rate for processing.
   - If the page or document has more than a few tables or images, **VLM** partitioning is used, and the page or document is billed at the **VLM** rate for processing.
+  - If the page or document is a video or audio file, **Multimedia** partitioning is used.
 
   [Learn about partitioning strategies](/ui/partitioning).