AI-powered media summarization platform that converts YouTube videos, uploaded files, and audio into concise summaries using OpenAI GPT and Whisper, with a contextual chatbot and Stripe subscriptions.
Youtella was built for a US-based client who had a clear problem to solve: too much video content, not enough time to consume it. The platform allows users to submit a YouTube link or upload a video or audio file, and receive an AI-generated summary in seconds. A built-in chatbot then lets users ask specific questions about the content they submitted, pulling answers from the generated summary.
The client brought the core idea and relied entirely on Cenciss for technical planning, feature roadmap, architecture decisions, and execution. The business model required a subscription layer with automated monthly billing and proper cancellation handling through Stripe webhooks.
Cenciss led the full engagement from product definition through MVP launch.
Media processing was the technical core of the platform. Uploaded video files had to be converted to audio before transcription, and the pipeline had to handle that conversion reliably across different file formats and sizes without blocking the user experience. Any failure in the processing chain would break the product's primary value proposition.
Transcription accuracy mattered as much as speed. Whisper was the right tool for uploaded media, but integrating it correctly for variable-length content required careful handling of the audio input and output pipeline. For YouTube videos, the platform used RapidAPI to extract captions, which introduced a dependency on third-party reliability for publicly available content.
The GPT summarization layer had to produce summaries that were concise but substantive, not just truncated transcripts. Prompt engineering for different content types (technical tutorials, interviews, lectures) required iteration to produce consistently useful output.
The chatbot added a second AI integration requirement. It had to answer questions in context: not general knowledge queries, but specific questions about the content the user had just submitted. The implementation needed to pass the right context to the model efficiently without hitting token limits on long summaries.
Stripe subscription management required webhook configuration for three critical events: successful payment, monthly renewal, and cancellation. Each webhook needed to update the user's subscription status correctly in the database, and any failure in webhook processing would affect the user's platform access.
The platform was built on the MERN stack: React.js on the front end, Node.js and Express.js on the back end, and MongoDB for user data, summaries, subscription records, and interaction logs.
The media processing pipeline was built server-side in Node.js. Uploaded video files were converted to audio before being passed to OpenAI Whisper for transcription. YouTube links triggered a RapidAPI call to fetch available captions. Both paths converged on the same GPT summarization step, keeping the downstream logic consistent regardless of input type.
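The convergence of the two input paths can be illustrated with a minimal sketch. Every name here (MediaInput, PipelineDeps, summarizeMedia) is hypothetical and not taken from the Youtella codebase, and the external services are injected as plain functions so the routing logic stands on its own:

```typescript
// Hypothetical types for illustration only.
type MediaInput =
  | { kind: "youtube"; url: string }
  | { kind: "upload"; filePath: string };

// External steps are injected, so this sketch makes no claim about how the
// caption API, audio conversion, Whisper, or GPT calls were actually written.
interface PipelineDeps {
  fetchCaptions(url: string): Promise<string>;     // caption lookup for YouTube links
  extractAudio(filePath: string): Promise<string>; // returns the path of the audio file
  transcribe(audioPath: string): Promise<string>;  // speech-to-text step
  summarize(transcript: string): Promise<string>;  // summarization step
}

// Each input type takes its own route to plain transcript text.
async function toTranscript(input: MediaInput, deps: PipelineDeps): Promise<string> {
  switch (input.kind) {
    case "youtube":
      return deps.fetchCaptions(input.url);
    case "upload": {
      const audioPath = await deps.extractAudio(input.filePath);
      return deps.transcribe(audioPath);
    }
  }
}

// Both routes end in the same summarization call.
async function summarizeMedia(input: MediaInput, deps: PipelineDeps): Promise<string> {
  const transcript = await toTranscript(input, deps);
  if (transcript.trim().length === 0) {
    throw new Error("No transcript text was produced for this input");
  }
  return deps.summarize(transcript);
}
```

Because everything after toTranscript sees only text, a new input type would need only another branch in that one function.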
Summarization prompts were engineered to produce structured, readable output calibrated to the content type. The chatbot was implemented by passing the generated summary as context with each user query, giving the GPT model the information it needed to answer specifically about the submitted content rather than drawing on general training data.
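One way to pass the summary as context while staying under a token limit is sketched below. The function names, the instruction wording, the 3000-token default, and the four-characters-per-token estimate are all assumptions made for illustration; a production system would more likely count tokens with a real tokenizer:

```typescript
// Message shape used by chat-style completion APIs.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Rough heuristic for English text, not a real tokenizer (assumption).
const CHARS_PER_TOKEN = 4;

// Trim text to an approximate token budget, cutting at a word boundary
// when one exists before the limit.
function clipToTokenBudget(text: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;
  const cut = text.lastIndexOf(" ", maxChars);
  return text.slice(0, cut > 0 ? cut : maxChars).trimEnd();
}

// Build the messages for one chatbot turn: the summary travels in the system
// message, and the user's question follows as its own message.
function buildChatMessages(
  summary: string,
  question: string,
  maxSummaryTokens = 3000, // hypothetical default
): ChatMessage[] {
  const context = clipToTokenBudget(summary, maxSummaryTokens);
  return [
    {
      role: "system",
      content:
        "Answer only from the summary below. If the summary does not cover " +
        "the question, say so.\n\nSummary:\n" + context,
    },
    { role: "user", content: question },
  ];
}
```

Sending the summary with every query keeps each request stateless, at the cost of repeating the same context tokens on every turn.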
Stripe was integrated with webhook endpoints configured for payment success, renewal, and cancellation events. Each webhook updated the user's subscription status in MongoDB, controlling access to the platform's processing features based on active subscription state.
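The status update behind those webhooks can be sketched as a small pure function. The mapping of Stripe event types to the three events named above is an assumption, as are the status values and function names. Signature verification and the MongoDB write are left out, and an in-memory set stands in for a persistent record of processed events:

```typescript
type SubscriptionStatus = "none" | "active" | "canceled";

// Only the fields this sketch needs from a webhook event.
interface WebhookEvent {
  id: string;
  type: string;
}

// Assumed mapping: the source names only "payment success, renewal, and
// cancellation", not the specific Stripe event types that were used.
function statusForEvent(eventType: string, current: SubscriptionStatus): SubscriptionStatus {
  switch (eventType) {
    case "checkout.session.completed": // first successful payment
    case "invoice.paid": // monthly renewal
      return "active";
    case "customer.subscription.deleted": // cancellation
      return "canceled";
    default:
      return current; // unrelated events leave the status unchanged
  }
}

// Stripe can deliver the same event more than once, so processed IDs are
// remembered. A real service would store them in the database instead.
const processedEventIds = new Set<string>();

function applyEvent(event: WebhookEvent, current: SubscriptionStatus): SubscriptionStatus {
  if (processedEventIds.has(event.id)) return current;
  processedEventIds.add(event.id);
  return statusForEvent(event.type, current);
}

// Access to the processing features depends on an active subscription.
function canProcessMedia(status: SubscriptionStatus): boolean {
  return status === "active";
}
```

Keeping the mapping free of I/O means it can be tested without calling Stripe or the database.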
The UI was designed for simplicity: paste a link or upload a file, wait for the summary, read or ask questions. The front end was built in React.js with state management covering the asynchronous processing flow and real-time chatbot interaction.
Youtella launched as a functional MVP with all core features operational: YouTube caption extraction, uploaded media transcription via Whisper, GPT-powered summarization, contextual chatbot responses, and automated Stripe subscription billing.
Users could submit a video and receive a usable summary in seconds rather than watching hours of content, with the chatbot providing a second layer of value for users who wanted to interrogate specific parts of the source material.
The subscription system handled billing and cancellation events through Stripe webhooks reliably in both sandbox testing and production verification. The client received a complete, scalable AI-powered product, taken from a single founding idea to a production-ready platform.