tensorflow-core-kotlin/tensorflow-core-kotlin-api/src/gen/annotations/org/tensorflow/op/kotlin/AudioOps.kt
Lines changed: 32 additions & 32 deletions
@@ -47,33 +47,33 @@ public class AudioOps(
 
 /**
  * Produces a visualization of audio data over time.
- *
+ *
  * Spectrograms are a standard way of representing audio information as a series of
  * slices of frequency information, one slice for each window of time. By joining
  * these together into a sequence, they form a distinctive fingerprint of the sound
  * over time.
- *
+ *
  * This op expects to receive audio data as an input, stored as floats in the range
  * -1 to 1, together with a window width in samples, and a stride specifying how
  * far to move the window between slices. From this it generates a three
  * dimensional output. The first dimension is for the channels in the input, so a
  * stereo audio input would have two here for example. The second dimension is time,
  * with successive frequency slices. The third dimension has an amplitude value for
  * each frequency during that time slice.
- *
+ *
  * This means the layout when converted and saved as an image is rotated 90 degrees
  * clockwise from a typical spectrogram. Time is descending down the Y axis, and
  * the frequency decreases from left to right.
- *
+ *
  * Each value in the result represents the square root of the sum of the real and
  * imaginary parts of an FFT on the current window of samples. In this way, the
  * lowest dimension represents the power of each frequency in the current window,
  * and adjacent windows are concatenated in the next dimension.
- *
+ *
  * To get a more intuitive and visual look at what this operation does, you can run
  * tensorflow/examples/wav_to_spectrogram to read in an audio file and save out the
  * resulting spectrogram as a PNG image.
- *
+ *
  * @param input Float representation of audio data.
  * @param windowSize How wide the input window is in samples. For the highest efficiency
  * this should be a power of two, but other values are accepted.
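
The KDoc in this hunk describes sliding a window of `windowSize` samples across the input and advancing it by a stride between slices. As a rough illustration of that framing step only, the sketch below lists the start offset of each window. The helper `windowStarts` is hypothetical and not part of the TensorFlow API, and it assumes that only complete windows are kept (no padding of the final partial window), which this diff does not confirm.

```kotlin
// Hypothetical helper for illustration only; not part of TensorFlow.
// Assumption: a slice is produced only when a full window fits in the input.
fun windowStarts(sampleCount: Int, windowSize: Int, stride: Int): List<Int> {
    require(windowSize > 0 && stride > 0) { "windowSize and stride must be positive" }
    // No complete window fits, so there are no time slices.
    if (sampleCount < windowSize) return emptyList()
    // Offsets 0, stride, 2 * stride, ... up to the last position where a full window fits.
    return (0..(sampleCount - windowSize) step stride).toList()
}

fun main() {
    // Example values chosen arbitrarily for the demonstration.
    val starts = windowStarts(sampleCount = 16000, windowSize = 1024, stride = 512)
    println("time slices: ${starts.size}, first offsets: ${starts.take(3)}")
}
```

Under that assumption, the number of offsets returned corresponds to the length of the second (time) dimension of the output described in the comment; the per-window FFT itself is left out of the sketch.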