Replies: 1 comment
I am running into the same problem. Did you ever find a solution, or did you end up using other software?
I'm using silence detection to determine the start and end of the blocks I transcribe with Whisper, but I keep hitting the same issue: transcribed segments that follow a silence don't get the "correct" start time for their text.
For example, say we have a block of 30s of audio:
00:05->00:06 Hi there
(silence for a few seconds)
00:06->00:11 Yeah I'm good thanks
^^^ The problem in this case is that the "Yeah" really starts at around the 9 or 10s mark, not at 00:06. The silence has been absorbed into the start of the second segment.
Is there a transcription setting that causes this, or is it down to the model and simply how it assigns timestamps?
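
To make the kind of correction I'm after concrete, here is a rough post-processing sketch. It is only an illustration, not something Whisper provides: the function name `snap_starts_to_speech`, the `tolerance` parameter, and the 9.5s silence end are all made up for this example. It assumes the segments are dicts with `start`, `end` and `text` keys, and that the silence detector returns a list of `(start, end)` pairs in seconds.

```python
from typing import Dict, List, Tuple

Segment = Dict[str, object]          # assumed shape: {"start": float, "end": float, "text": str}
Silence = Tuple[float, float]        # assumed shape: (silence_start, silence_end) in seconds


def snap_starts_to_speech(
    segments: List[Segment],
    silences: List[Silence],
    tolerance: float = 0.25,         # hypothetical slack, in seconds
) -> List[Segment]:
    """Return copies of the segments with absorbed leading silence trimmed.

    If a segment starts at (or inside) a detected silence and continues past
    the end of it, its start is moved to the end of that silence.
    """
    fixed = []
    for seg in segments:
        new_seg = dict(seg)          # copy so the input is not mutated
        start = float(new_seg["start"])
        end = float(new_seg["end"])
        for sil_start, sil_end in silences:
            starts_in_silence = sil_start - tolerance <= start < sil_end
            continues_after = end > sil_end
            if starts_in_silence and continues_after:
                start = sil_end      # speech can only begin once the silence ends
        new_seg["start"] = start
        fixed.append(new_seg)
    return fixed


# The example from this post, with a made-up silence from 6.0s to 9.5s.
segments = [
    {"start": 5.0, "end": 6.0, "text": "Hi there"},
    {"start": 6.0, "end": 11.0, "text": "Yeah I'm good thanks"},
]
silences = [(6.0, 9.5)]

for seg in snap_starts_to_speech(segments, silences):
    print(seg["start"], seg["end"], seg["text"])
```

This only handles silence at the very start of a segment; a pause in the middle of a segment is left alone. If you are using the openai-whisper Python package, its `transcribe` call also accepts `word_timestamps=True`, which might give tighter start times on its own, though I have not confirmed that it fixes this case.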