Lessons from Building ActSolo.AI - An AI Teleprompter and Scene Partner for My Wife
131 messages sent, 73 AI edits, and some extra ‘credits’ later, I’ve reached the end of a 2-week sprint on my first project with Loveable.
The inspiration started at home: My wife, an actor, needed a better way to rehearse and record lines for auditions without depending on my availability as her “scene partner.” With two young kids, it’s nearly impossible to carve out quiet time for both of us to record scenes without some extra help.
For a while we had settled on creating separate recordings and combining them in post, but it never felt very natural. That’s what inspired me to try building a web-based teleprompter solution.
So I set out to build: ActSolo.AI.
ActSolo.AI is a web-based AI teleprompter where actors can copy/paste scripts and assign AI voices (from ElevenLabs) that actively read their ‘scenes’ with them, with emotive control. Yes, AI is coming even for scene readers.
I started planning:
Wrote a product requirements doc (PRD – outlines requirements, costs, features, etc.)
Defined risks: voice tech, script logic, session reliability, devices, legal stuff
Built a work breakdown structure (WBS – breakdown of project scope)
Uploaded all of my project details to ChatGPT, my co-developer (I already had a subscription plan), and then to Loveable. The product was coming together.
Tech stack:
Loveable
Supabase for backend logs and data
ElevenLabs API for its high-quality voices and Text-to-Speech capabilities (a rough sketch of the call is below)
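For the curious, here’s roughly what that ElevenLabs call looks like from a web app. This is a minimal sketch, not ActSolo.AI’s actual code; the voice ID and API key are placeholders, and the voice_settings values are illustrative:

```ts
// Minimal sketch of calling ElevenLabs text-to-speech from the browser.
// The voice ID and API key are placeholders, not real credentials.
async function speakLine(text: string, voiceId: string, apiKey: string): Promise<void> {
  const res = await fetch(`https://api.elevenlabs.io/v1/text-to-speech/${voiceId}`, {
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
      Accept: "audio/mpeg",
    },
    body: JSON.stringify({
      text,
      model_id: "eleven_multilingual_v2",
      // Rough emotive control: lower stability tends to sound more expressive.
      voice_settings: { stability: 0.4, similarity_boost: 0.75 },
    }),
  });
  if (!res.ok) throw new Error(`TTS request failed: ${res.status}`);

  // Play the returned MP3 bytes via a blob URL.
  const blob = await res.blob();
  await new Audio(URL.createObjectURL(blob)).play();
}
```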
Here is a screenshot of the rehearsal page with the ElevenLabs voice panel open, showing the variety of voices available to rehearse with. Actors can edit their script while rehearsing, view the screen in full, zoom in/out, control speeds, and assign lines for the AI to read by bolding or using italics.
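To give a flavor of that cue-marking logic (a simplified sketch, not the app’s actual parser), the idea is that any line wrapped in bold or italic markers gets assigned to the AI scene partner:

```ts
type Speaker = "AI" | "ACTOR";
interface ScriptLine { speaker: Speaker; text: string; }

// Sketch: lines wrapped in **bold** or *italic* markers belong to the AI;
// everything else belongs to the actor.
function parseScript(raw: string): ScriptLine[] {
  return raw
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => {
      const marked = line.match(/^(\*\*|\*)(.+?)\1$/);
      return marked
        ? { speaker: "AI" as const, text: marked[2].trim() }
        : { speaker: "ACTOR" as const, text: line };
    });
}
```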
The Journey
Supabase handled everything well, and reviewing its logs while troubleshooting has been amazing. The backend was almost never the source of my biggest problems.
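The logging itself was nothing exotic; it’s essentially the standard supabase-js insert pattern, sketched below with a hypothetical table and columns:

```ts
import { createClient } from "@supabase/supabase-js";

// Placeholders, not a real project URL or key.
const supabase = createClient("https://your-project.supabase.co", "public-anon-key");

// Hypothetical "rehearsal_logs" table: one row per app event, so a
// failing session can be replayed step by step afterwards.
async function logEvent(sessionId: string, event: string, detail: object): Promise<void> {
  const { error } = await supabase
    .from("rehearsal_logs")
    .insert({ session_id: sessionId, event, detail });
  if (error) console.error("log insert failed:", error.message);
}
```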
Instead, as I only recently realized, the real culprit is the browser’s native Web Speech API and its limitations across Safari, Chrome, and Firefox.
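The gaps start at feature detection: Chrome and Safari only expose the prefixed constructor, and Firefox ships no speech recognition at all by default. A basic check looks something like this sketch:

```ts
// Cross-browser feature detection for the Web Speech API.
const SpeechRecognitionImpl =
  (window as any).SpeechRecognition ?? (window as any).webkitSpeechRecognition;

if (!SpeechRecognitionImpl) {
  console.warn("No speech recognition in this browser (e.g. Firefox by default).");
}
if (!("speechSynthesis" in window)) {
  console.warn("No native speech synthesis available either.");
}
```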
Throughout the project, I would make little strides here and there, feel waves of happiness, then run into errors where the app wasn’t listening like I needed it to or wouldn’t read the script text.
Unexpectedly, I spent about 90% of the time:
Reviewing code
Setting up logging and reviewing logs
Debugging filters
Trying to figure out why the AI wouldn’t stop speaking, or why it started when it shouldn’t (see the sketch after this list)
Testing in the browser dev tools
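The “wouldn’t stop speaking” mystery, for example, traces back to how speechSynthesis queues utterances: stale lines keep playing unless you cancel first. A defensive wrapper along these lines (a sketch of the idea, not my production code) is roughly where I landed:

```ts
// Defensive wrapper around the browser's native TTS.
function speakWithNativeTTS(text: string): void {
  window.speechSynthesis.cancel(); // flush any utterances still queued

  const utterance = new SpeechSynthesisUtterance(text);
  utterance.onerror = (e) => console.error("synthesis error:", e.error);

  // Chrome is known to silently stall long utterances; periodically
  // calling resume() is a common (ugly) workaround.
  const keepAlive = setInterval(() => window.speechSynthesis.resume(), 10_000);
  utterance.onend = () => clearInterval(keepAlive);

  window.speechSynthesis.speak(utterance);
}
```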
At times I was pulling my hair out…but it’s been an awesome experience to build and own the product with these ‘vibe-coding’ tools.
I spent a week refactoring and building smarter state machines to make the rehearsal flow more predictable, maintainable, and extensible while reducing bugs. I should’ve done this sooner.
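To give a flavor of the state-machine approach (heavily stripped down; the real machine has more states and side effects, and the state names here are illustrative):

```ts
// One explicit state plus a table of legal transitions, so "listening"
// and "speaking" can never overlap by accident.
type RehearsalState = "idle" | "displaying" | "actorSpeaking" | "aiSpeaking" | "paused";

const transitions: Record<RehearsalState, RehearsalState[]> = {
  idle: ["displaying"],
  displaying: ["actorSpeaking", "aiSpeaking", "paused"],
  actorSpeaking: ["aiSpeaking", "paused", "displaying"],
  aiSpeaking: ["actorSpeaking", "paused", "displaying"],
  paused: ["displaying", "idle"],
};

function transition(current: RehearsalState, next: RehearsalState): RehearsalState {
  if (!transitions[current].includes(next)) {
    throw new Error(`Illegal transition: ${current} -> ${next}`);
  }
  return next;
}
```

The payoff of making every transition explicit is that overlap bugs surface as loud, loggable errors instead of silent weirdness.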
I chased edge cases and learned a lot with ChatGPT as my co-developer, note taker, and prompt refiner.
So: nothing on the backend, no major issues with ElevenLabs, just browser quirks and limitations I didn’t recognize until recently, which finally convinced me that I’m going to need to change things up.
Key lessons:
Just build it, and test: I still saved time, money, and sanity, and learned a bunch building with Loveable. I can't wait to try building something new
Of course, put the user first: If native tools don’t cut it, it’s smarter to pivot
Platform limits: No amount of clever prompting or client-side code will make browser voice limitations work like I want (yet). This project will need server-side processing to handle the heavy lifting.
Logs are a real resource: they showed me where the issues really were
Chat management: At first I burned through credits too quickly, but then I became stricter and found a good prompt rhythm with GPT before going back to Loveable with changes.
What’s Next
I’m working on moving the product to a cloud-based Text-to-Speech (TTS) and voice-recognition stack, with the goal of a more reliable, actor-friendly, cross-device rehearsal experience while keeping costs under control.
Currently, I’m looking into building on top of what I already created with Loveable: the script-upload UI, the text-formatting logic for marking cues (bold, italic, etc.), the playback controls, and the basic flow controls should all be transferable.
So far, it looks like building on options like FlutterFlow or Bubble.io, combined with OpenAI’s Whisper API, should do the trick.
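For the speech-recognition half, the Whisper call itself is straightforward; a sketch assuming Node 18+ (global fetch/FormData) and an OPENAI_API_KEY environment variable:

```ts
import { readFile } from "node:fs/promises";

// Send a recorded take to OpenAI's transcription endpoint (whisper-1)
// and return the transcribed text.
async function transcribeTake(path: string): Promise<string> {
  const form = new FormData();
  form.append("file", new Blob([await readFile(path)]), "take.webm");
  form.append("model", "whisper-1");

  const res = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${process.env.OPENAI_API_KEY}` },
    body: form,
  });
  if (!res.ok) throw new Error(`Whisper request failed: ${res.status}`);
  const { text } = (await res.json()) as { text: string };
  return text;
}
```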
Then, we will eventually make this a product offering under my wife’s business, Cameraon Creative.
If you’re reading this and have any feedback, please reach out!