webvtt example and SIP for av content #5

Merged
merged 2 commits into from
Jun 13, 2024
18 changes: 17 additions & 1 deletion digitization/av/av_bestpractices.md
@@ -48,4 +48,20 @@ Preservation master files should represent the entirety of the source material.
| --------------- | ---------------- |
| MPEG-4 (MP4) | H.264 |


## Submission Information Packet for AV content

Digitized a/v content often consists of multiple components. Package these components in a single submission information packet (SIP) for ingest into SCRC digital storage.

Example SIP for a/v content:


```
ms2374_s2_c107d_f7_i1
├── ms2374_s2_c107d_f7_i1_001.mov
├── ms2374_s2_c107d_f7_i1_002.mov
└── access
    ├── ms2374_s2_c107d_f7_i1_001.mp4
    ├── ms2374_s2_c107d_f7_i1_001_eng.vtt
    ├── ms2374_s2_c107d_f7_i1_002.mp4
    └── ms2374_s2_c107d_f7_i1_002_eng.vtt
```
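
A layout like the example SIP can be checked with a short script before ingest. The following is a minimal Python sketch, not an SCRC tool. It assumes, based only on the example, that every `.mov` file at the top level should have a matching `.mp4` and `_eng.vtt` file in `access`, and that file names begin with the SIP folder name. The function name `check_sip` and the `caption_lang` parameter are hypothetical.

```python
from pathlib import Path


def check_sip(sip_dir: Path, caption_lang: str = "eng") -> list[str]:
    """Return a list of problems found in a SIP folder (empty if none found)."""
    problems = []

    access_dir = sip_dir / "access"
    if not access_dir.is_dir():
        problems.append("missing access/ folder")

    # Assumption: preservation masters are the .mov files at the top level.
    masters = sorted(sip_dir.glob("*.mov"))
    if not masters:
        problems.append("no .mov preservation files found")

    for master in masters:
        stem = master.stem
        # Assumption: every file name starts with the SIP folder name.
        if not stem.startswith(sip_dir.name + "_"):
            problems.append(f"{master.name}: does not start with {sip_dir.name}_")
        # Assumption: each master has one access copy and one caption file.
        for expected in (f"{stem}.mp4", f"{stem}_{caption_lang}.vtt"):
            if not (access_dir / expected).is_file():
                problems.append(f"missing access/{expected}")

    return problems
```

Calling `check_sip(Path("ms2374_s2_c107d_f7_i1"))` on a folder laid out like the example returns an empty list. Each missing access copy or caption file adds one entry.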
8 changes: 4 additions & 4 deletions digitization/av/av_digitization.md
@@ -8,9 +8,9 @@ nav_order: 2
---
# Audio and Video Reproduction

Audio cassette tapes, video tapes, CDs, DVDs, and other audio-visual formats must be digitally reproduced to facilitate research access. We do not permit researchers to listen or view any electronic media in its original form.
Audio cassette tapes, video tapes, CDs, DVDs, and other audio-visual formats must be digitally reproduced to facilitate research access. We do not permit researchers to listen to or view any electronic a/v media in its original form.

## Reproduction of Material Via Vendor
Special Collections is not able to digitize all a/v formats. For any requests where Special Collections' on-site equipment cannot handle materials, we may be able to send those materials to a vendor to digitize. Material in poor condition that requires conservation or specialized expertise and or equipment may also be sent to vendors.
## Reproduction of Materials via Vendor
The Special Collections Research Center is not able to digitize all a/v formats. When our on-site equipment cannot handle the materials in a request, we may be able to send them to a vendor for digitization. Material in poor condition that requires conservation or specialized expertise and/or equipment may also be sent to vendors.

Also, large requests or requests that cannot be filled in a reasonable period of time with available staff time may also be sent to vendors.
Large user requests or requests that cannot be filled in a reasonable period of time may also be sent to vendors.
42 changes: 42 additions & 0 deletions digitization/av/captions.md
@@ -41,5 +41,47 @@ See [Guidelines for Embedding Metadata in WebVTT Files (FADGI)](https://www.digi
- Language
- Originating File

# Example WebVTT

```
WEBVTT
Type: caption
Language: eng
Responsible Party: US, GWU/SCRC
File Creator: GWU/SCRC
File Creation Date: 2024-05-22
Originating File: rg0131_s04_ss02_c02_i04.mp4

00:01:48.000 --> 00:01:50.000
Thank you very much.

00:01:50.000 --> 00:01:55.000
And now we are going into our next topic.

00:01:55.000 --> 00:02:00.000
And it's beyond the next election.

00:02:00.000 --> 00:02:08.000
And you know the next president of the United States will be sworn in, will have been sworn in,

00:02:08.000 --> 00:02:14.000
a year, one year, one week, and one day from now.

00:02:14.000 --> 00:02:18.000
Not quite one day, 22 hours.

00:02:18.000 --> 00:02:29.000
And that is a little too long to predict what, not only who he'll be and what party,

00:02:29.000 --> 00:02:35.000
but also what the conditions are under which he will take over,

00:02:35.000 --> 00:02:41.000
except that based on the last half, last dozen or so inaugurations,

00:02:41.000 --> 00:02:45.000
you can predict a snowstorm in Washington for the day.
```

---
[^1]: Dave Rodriguez, Bryan J. Brown, and Florida State University Libraries, “Comparative Analysis of Automated Speech Recognition Technologies for Enhanced Audiovisual Accessibility,” The Code4Lib Journal, no. 58 (December 4, 2023), https://journal.code4lib.org/articles/17820.