Finally had some success with a clickable highlighting transcript, on the Brazilian version of this site! Thanks to the help of ChatGPT I was able to build HTML code to display a transcript, then a quick Python script to convert WEBVTT docs into HTML transcripts.
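The conversion script came out of a back-and-forth with ChatGPT, so here is only a rough sketch of the idea, assuming the transcript lines become `<span>` elements tagged with a `data-start` attribute (the class and attribute names are my own placeholders, not the exact code I used):

```python
import re

def vtt_to_html(vtt_text):
    """Turn WEBVTT cues into <span> lines tagged with their start time."""
    spans = []
    # Cues are separated by blank lines; the timing line looks like
    # "00:00:01.000 --> 00:00:04.000" (the hours part is optional in WEBVTT).
    for block in re.split(r"\n\s*\n", vtt_text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            m = re.match(r"(?:(\d{2}):)?(\d{2}):(\d{2})\.(\d{3})\s*-->", line)
            if not m:
                continue
            h = int(m.group(1) or 0)
            start = (h * 3600 + int(m.group(2)) * 60
                     + int(m.group(3)) + int(m.group(4)) / 1000)
            text = " ".join(lines[i + 1:]).strip()
            if text:
                spans.append(
                    f'<span class="cue" data-start="{start:g}">{text}</span>'
                )
            break  # one timing line per cue block
    return "\n".join(spans)
```

A little JavaScript on the page can then watch the audio's current time and highlight (or seek to) the span whose `data-start` matches.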
Still working on some desired functionality (like a toggle between the main site font and the OpenDyslexic font),
but it's definitely a success for now. The audio file was generated via murf.ai; I found this voice to be the most realistic. (Lucky for me, I'm currently in Brazil for the holidays, so I had an easy way to get feedback.)
This success brought me back to one of the original ideas I had when I began this site: building a more integrated version of the OtterAI transcripts. The embedded transcripts you see around the site have always been a stopgap until something more advanced was developed.
So to build off the Brazil progress, I went back to ChatGPT, this time asking if I could change the video tag code to swap the mp3 for a YouTube video. Preliminary testing had limited success without a YouTube API (it's also a bit complicated to hide an API key on a blog post). That's not a deal breaker, but I already have that part built in via Wix, so long term the solution for 100s of videos and 1000s (hopefully more soon) of users would be to use the YouTube API built into Wix -- so this post is to test just that.
It seems ChatGPT has given me another hallucination of code that doesn't quite work, but it might be as simple as adding the Wix YouTube Player API. Will have to call it a night for now and try this with fresh eyes.
Back to basics: let's just embed a YouTube video first. Now that we have that, let's go back to ChatGPT and ask: "Can you help me add a clickable highlighting transcript that will play this youtube video if I use the embed method on youtube share? <iframe width="560" height="315" src="https://www.youtube.com/embed/EQ3GjpGq5Y8" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>"
Hmm, seems we are close. Now I need to go to YouTube, get the transcript, and convert it to HTML.
There are a few ways to do this. For this test I will use jeffistyping/Youtube-Whisperer, located at https://huggingface.co/spaces/jeffistyping/Youtube-Whisperer, to get a .srt file, plus a Python script written by ChatGPT to convert the output to HTML code with timestamps.
Actually, I will first try the glasp.co YouTube Chrome extension to get a transcript with timestamps, then create a Python script (with the help of ChatGPT) to convert it to HTML code.
Never mind, that did not work as intended. Back to the .srt file test. Another way to do this is the youtube-vtt tool: https://github.com/EverythingNeurodiversity/youtube-vtt
And that seems to be no longer working (or deprecated), so I found https://downsub.com/
This provides a nice .srt output of the transcript that I will use to create the HTML code.
Back to ChatGPT for some Python:
I will use Jupyter Notebook locally to run this and generate the HTML.
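The .srt format differs from WEBVTT mainly in its numbered cue blocks and comma-separated milliseconds, so the conversion step looks roughly like this. This is a hedged sketch, not the exact ChatGPT output; the `data-start` attribute and class names are my own assumptions:

```python
import re

def srt_to_html(srt_text):
    """Convert .srt captions (e.g. from downsub.com) into timestamped HTML."""
    out = ["<div class='transcript'>"]
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3:
            continue
        # SRT blocks: index line, then a timing line like
        # "00:00:02,000 --> 00:00:05,000", then one or more text lines.
        m = re.match(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})", lines[1])
        if not m:
            continue
        h, mnt, s, ms = (int(g) for g in m.groups())
        start = h * 3600 + mnt * 60 + s + ms / 1000
        text = " ".join(lines[2:]).strip()
        out.append(f"<p class='line' data-start='{start:g}'>{text}</p>")
    out.append("</div>")
    return "\n".join(out)
```

Running this in a notebook cell over the downloaded .srt gives a block of HTML ready to paste into the post.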
A few iterations later, we're in business. (Sometimes ChatGPT will cut off the response.) AI is amazing, but imperfect.
Time to put this together. First I have to jump on a meeting with a friend to see if we can create some animations.
After some more conversations with the ChatGPT bot, I might have found another way to do this and overcome the hurdles encountered earlier. Here is a test of just the player embed using the YouTube API.
Now back to the transcript part:
Of course: the YouTube API plus the seek element. I've been hacking away at this problem for so long, and naturally it's the simplest solution that does the trick.
Now I just need a quick Python script to create the HTML for it (and a lot of runs of it -- there are 550+ posts on this site).
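The YouTube API + seek idea can be sketched as a small Python generator that wraps the transcript HTML in a page using the official YouTube IFrame API, where clicking a transcript line calls `player.seekTo()`. The `build_page` helper and the transcript markup are my own placeholders, not the exact script I ran:

```python
# Hypothetical generator: wraps transcript markup in a page that loads the
# YouTube IFrame API so clicking a line with a data-start attribute seeks
# the video. Doubled braces are literal JS braces for str.format().
PAGE_TEMPLATE = """<div id="player"></div>
{transcript}
<script src="https://www.youtube.com/iframe_api"></script>
<script>
var player;
function onYouTubeIframeAPIReady() {{
  player = new YT.Player('player', {{videoId: '{video_id}'}});
}}
document.querySelectorAll('[data-start]').forEach(function (el) {{
  el.addEventListener('click', function () {{
    if (player && player.seekTo) {{  // guard: API may still be loading
      player.seekTo(parseFloat(el.dataset.start), true);
      player.playVideo();
    }}
  }});
}});
</script>"""

def build_page(video_id, transcript_html):
    """Assemble the final HTML for one post from its video ID and transcript."""
    return PAGE_TEMPLATE.format(video_id=video_id, transcript=transcript_html)
```

Looping `build_page` over all the posts is what would make the 550+ conversions tractable.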
I do want to make sure I can do the OpenDyslexic font as well first.
Seems to be having trouble with the OpenDyslexic font. Back to the highlight function:
Well, now ChatGPT is timing out often; will try again soon. It's back, and I'm still testing out a few things for UX. For example, it might make sense to play/pause in order to initialize the movie so that the text buttons work without the user hitting play.
The to-do list still includes:
Toggle for the OpenDyslexic font
Text search functionality
Python script to create the HTML for long transcripts
Ability to estimate timing for data points not in the file already
Ability to add chapters (as Huberman does on his site already) for specific parts of the video.
Getting much closer to what looks like success after a handful of iterations with ChatGPT. Sometimes it comes up with wildly complicated ways of solving problems, and it's important to ask simpler questions to find out why, then work from there. This was the case for the highlighting changes I needed. I will try to post the conversation I had with ChatGPT when I can to illustrate what I am referring to. For now, here is the latest test.
OpenDyslexic font option
Need to clean up spacing and display of the text
This solution does not allow Google to index the transcript -- need to think a bit about whether there is another way to do this
Center the display of the video
Ability to automate grabbing the .srt file and creating the HTML code via a Python script.