Creating a browser extension to record and transcribe

I was really excited to see the tutorial on creating the browser extension.

I’m creating a leadership platform where I’m hoping to be able to record and transcribe one-to-one meetings.

The problem I have is that it’s easy to record the person who has my platform open, as we can just record using the browser, but capturing the other person is very difficult if they’re using something like Zoom or Teams.

What’s my best option for recording both sides, then transcribing and saving it? Of course this would all be permission-based, and it would be great if I could do Google Meet too, but I’m really not sure whether this is possible with this new browser extension option.

I can’t answer your question in terms of building a Replit app to record/transcribe - an interesting project, but I have no experience.

But for simple record/transcribe of calls, I use this paid SaaS tool: Jamie - Your Personal AI Note Taker.

Thanks so much for your reply, Steve.

Yeah, I normally use Otter or the native Teams/Meet/Zoom features.

What I’m hoping to do is find a way to have a version of this for my own system.

Browser recording and transcribing of both parties is probably the type of tricky thing where I would not roll my own, and would prefer to plug in to something ready-done.

I happen to know that the Jamie founder is working on a set of APIs (he said Q1). And some others probably have them too. Do Wispr offer anything?

I’ve created multiple Chrome extensions.

I’ve integrated Claude Code on my desktop to take over the older code bases and it does an excellent job.

If you do create an extension, I’d recommend the following tech stack, which I personally use:

  • Claude Code with Context7 MCP for WXT and React documentation
  • WXT extension framework with React (creates assets for both Chrome and Firefox)

What’s my best option for recording both sides, then transcribing and saving it? Of course this would all be permission-based, and it would be great if I could do Google Meet too, but I’m really not sure whether this is possible with this new browser extension option.

  1. Create an API endpoint on your app to receive audio files
  2. Have the extension send the audio file to that API endpoint
  3. Use a speech-to-text model such as OpenAI Whisper to transcribe it (very cheap cost-wise)
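To make step 2 a bit more concrete, here is a minimal sketch of the extension side of the upload. It is only an illustration under assumptions: the bearer-token header, the `transcript` field in the response, and the helper names (`buildUploadRequest`, `uploadRecording`, `HttpSend`) are all hypothetical, and the transcription in step 3 is assumed to happen on your server, which is not shown.

```typescript
// Sketch of step 2: the extension uploads a finished recording to your app.
// Header names and the response shape are assumptions, not a real API.

interface UploadInit {
  method: string;
  headers: Record<string, string>;
  body: Uint8Array;
}

interface UploadResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// The HTTP call is injected so the logic can be exercised without a network.
type HttpSend = (url: string, init: UploadInit) => Promise<UploadResponse>;

// Build the request separately so it can be inspected before sending.
function buildUploadRequest(
  audio: Uint8Array,
  mimeType: string,
  apiKey: string
): UploadInit {
  return {
    method: "POST",
    headers: {
      "Content-Type": mimeType,
      Authorization: `Bearer ${apiKey}`,
    },
    body: audio,
  };
}

// Send the audio and return the transcript the server is assumed to reply with.
async function uploadRecording(
  send: HttpSend,
  endpoint: string,
  audio: Uint8Array,
  mimeType: string,
  apiKey: string
): Promise<string> {
  const res = await send(endpoint, buildUploadRequest(audio, mimeType, apiKey));
  if (!res.ok) {
    throw new Error(`Upload failed with status ${res.status}`);
  }
  const data = (await res.json()) as { transcript?: string };
  if (typeof data.transcript !== "string") {
    throw new Error("Server response did not include a transcript");
  }
  return data.transcript;
}
```

In the extension itself, `send` could simply be `(url, init) => fetch(url, init)`, and the bytes could come from a `MediaRecorder` blob via `new Uint8Array(await blob.arrayBuffer())`. Keeping the request builder separate from the network call is just a design choice that makes it easy to test.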

Your main hurdle will be connecting your app user to your Chrome extension user. Chrome extensions have an identity API for OAuth that lets users log in to your app, but you have to custom-build the flow. You’ll basically have to create a login portal that Chrome can access, which then responds with the user’s info and an API key they can use to access your API. You store that key in Chrome storage, and whenever the extension needs to interact with your app, the API key ties the request to their account.
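As a rough sketch of the "store the key, then attach it to every request" part, assuming a small storage wrapper: the `KeyStore` interface, the `apiKey` slot name, and the helper functions are hypothetical names for illustration, and the in-memory store only stands in for Chrome storage so the logic can be tried outside a browser.

```typescript
// Hypothetical storage abstraction; in an extension it would wrap Chrome storage.
interface KeyStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, value: string): Promise<void>;
}

const API_KEY_SLOT = "apiKey"; // hypothetical storage key name

// Call this once the login portal has responded with the user's API key.
async function saveApiKey(store: KeyStore, apiKey: string): Promise<void> {
  await store.set(API_KEY_SLOT, apiKey);
}

// Build the headers that tie a request to the signed-in user's account.
async function authHeaders(store: KeyStore): Promise<Record<string, string>> {
  const apiKey = await store.get(API_KEY_SLOT);
  if (!apiKey) {
    throw new Error("Not signed in: no API key stored");
  }
  return { Authorization: `Bearer ${apiKey}` };
}

// In-memory stand-in for Chrome storage, useful for local testing only.
function createMemoryStore(): KeyStore {
  const data = new Map<string, string>();
  return {
    get: async (key) => data.get(key),
    set: async (key, value) => {
      data.set(key, value);
    },
  };
}
```

In a real extension, the store could be backed by `chrome.storage.local.get` and `chrome.storage.local.set`, and the login step could go through `chrome.identity.launchWebAuthFlow` pointed at your login portal. How the portal hands back the key is up to your app, so treat that part as an open design decision.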

Replit shouldn’t have any problem making API keys that tie to a user’s account. JSON Web Tokens (JWT) will probably be the easiest way and what it would default to. I’d toggle the Agent to Plan mode and have a chat with it about the best approach. :+1:
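If you do go the JWT route, one thing the extension may want is a way to tell when the stored token has expired so it can send the user back through login. Below is a minimal sketch, assuming the token carries the standard `exp` claim and ASCII-only claims; the function names are hypothetical. It only reads the payload and does NOT verify the signature, which must always be done on your server.

```typescript
interface JwtClaims {
  sub?: string; // subject, typically the user id
  exp?: number; // expiry time in seconds since the Unix epoch
}

// Decode the payload segment of a JWT without verifying its signature.
// Only suitable for client-side hints such as "should I log in again?".
function decodeJwtPayload(token: string): JwtClaims {
  const parts = token.split(".");
  const payload = parts[1];
  if (parts.length !== 3 || payload === undefined) {
    throw new Error("Not a JWT: expected three dot-separated parts");
  }
  // JWTs use base64url, so map it back to standard base64 and restore padding.
  const base64 = payload.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  return JSON.parse(atob(padded)) as JwtClaims;
}

// True when the token has an `exp` claim that is at or before `nowMs`.
function isExpired(claims: JwtClaims, nowMs: number): boolean {
  if (claims.exp === undefined) {
    return false; // no expiry claim, so this check cannot say it is expired
  }
  return nowMs >= claims.exp * 1000;
}
```

A typical use would be `isExpired(decodeJwtPayload(storedToken), Date.now())` before making a request, falling back to the login flow when it returns true.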

Absolutely nothing beats Notion for meeting notes/transcription. The usefulness of it when combined with Notion AI is undeniable. Notion remains the one SaaS I will not attempt to replace: it amplifies my productivity so much in its current form that it makes no sense to divert energy to replacing it.

Haha, I am totally with you :slight_smile: I use Notion for all our organisation of Sero.

What I am hoping to find is a way to record 1:1 meetings within my app.

Then I’d send the transcript to AI to assess.

Oh I see

MIKE ! Dude,

Thank you so much for this. It really gives me a lot to look into and try.

I will unpack it, try it and see where I get, thanks so much.

C