Found 40 results for “Transcription”
One AI is a language AI capability platform that supports integrating text, audio, and video processing capabilities into products, and can be used for analysis, transcription, summarization, and building customized language processing functions.
Noty.ai is an AI meeting assistant that supports real-time meeting transcription, summary generation, action item extraction, and follow-up tracking, and can connect with commonly used tools such as Google Meet, Zoom, Gmail, Calendar, and Docs.
Translate.video is an AI translation tool for video content, supporting video translation, subtitle translation, dubbing, AI voice conversion, recording, and text generation to help distribute video content in multiple languages.
NetEase Jianwai Workbench is an AI tool for office and collaboration scenarios, providing video and livestream transcription, speech-to-text, document translation, and other functions, suitable for teams handling multimedia and text content.
AI Photo is an AI image generation tool mainly used to generate image content based on user needs, suitable for creative visual production and everyday image generation scenarios.
Glasp YouTube Summarizer is a tool for summarizing YouTube video content. By combining a Chrome extension with GPT capabilities, it helps users quickly extract key points from videos and obtain learning information more efficiently.
AnthemScore is an automatic music transcription software that uses AI to convert MP3 and WAV audio files into sheet music or guitar tablature, and supports Windows, macOS, and Linux.
Transkribieren is an audio transcription tool that accepts uploads in multiple audio formats and offers speech-to-text on mobile, in the browser, and for meetings.
Visla's AI video generator is a tool for creating presentation and narrative videos. It can quickly generate and edit videos with AI, and provides features such as transcription and asset suggestions to help users produce professional-looking results.
Nuro.video is an AI video editing tool that can automatically transcribe, analyze, and organize long raw video footage into finished videos with titles, transitions, and animations.
MixPeek is an intelligent search layer built on top of object storage. Through APIs, it can extract, index, and perform natural language search on non-text files, helping applications quickly gain search-engine-like file retrieval capabilities.
A tool for quickly creating AI chatbots that can turn content such as websites, e-books, podcasts, videos, and YouTube playlists into conversational knowledge assistants, and offers different hosting plans.
Podsqueeze is an AI content repurposing tool for podcast creators that generates supporting content such as show descriptions, timestamps, and newsletters from podcast audio, helping streamline post-production work.
Spoke.ai is an AI assistant for meeting scenarios, offering automatic note-taking, real-time agenda tracking, meeting record generation, and video highlight extraction to help organize meeting information and decision content.
Krisp is an AI-based noise cancellation app mainly used to improve the quality of online meetings and voice calls. It supports Mac and Windows, includes voice productivity features, and offers a free version.
Moises App is an AI music tool that supports adjusting song key and speed, separating vocals and instruments, and provides mastering and audio extraction features.
Otter AI is a meeting recording and note-taking tool that supports real-time speech transcription, audio recording, slide capture, and automatic meeting summary generation for easier organization and review.
Notability.ai is an organization tool that combines Notion notes with Telegram. Users can send content through Telegram and automatically complete the categorization and organization of Notion notes.
Supertranslate is a video subtitle tool that can automatically transcribe videos in more than 100 languages and generate English subtitles, suitable for creators and teams that need to distribute content across languages.
Good Tape is an automatic speech-to-text tool that can quickly convert audio recordings into text, supports more than 90 languages, and is suitable for organizing interviews, meetings, and dictated content.
Oxolo.com is a video generation tool for e-commerce businesses that can extract selling points from product links and automatically generate video scripts for creating product videos, ads, and social media content.
User Evaluation is a website that provides AI tools for research, including interview summaries and trend analysis reports. Sign-up is free, and the tools target UI and UX research, product design, and product management.
Taption is a video and audio transcription tool that supports automatic subtitle generation and translation, covers more than 40 languages, and provides built-in editing features for organizing transcription content.
Koolio.ai is a web-based AI podcast creation and collaboration tool that helps users quickly complete podcast content from ideation to editing. It supports audio transcription, collaborative editing, automatic matching of sound effects or music, and common audio processing operations.
Jamie is an AI meeting assistant that quickly generates business-style summaries from meeting audio, helping users capture discussion highlights and speed up post-meeting documentation.
Suki is an AI voice assistant for doctors, used to reduce documentation and administrative burdens so clinical staff can devote more time to patient care.
MeetGeek is an AI meeting assistant that can automatically record, transcribe, summarize, and share meeting highlights. It supports keyword search, custom prompts, and integrations with various workplace tools, making it suitable for improving meeting documentation and collaboration efficiency.
AI models for transcribing and understanding speech
iFLYTEK Meeting is intelligent video conferencing software from iFLYTEK that emphasizes high-definition, low-latency, stable audio and video along with multi-party collaboration. It supports screen sharing, real-time multilingual subtitles, automatic meeting record generation, and AI noise reduction. Users can join from PCs, mobile phones, and smart screens.
Feishu Minutes offers intelligent meeting notes and fast AI speech-to-text transcription.
Maier Meeting Notes is an application from AISpeech that combines real-time speech transcription, real-time translation, and AI summary analysis, mainly for office meetings, students' online classes, and customer interview recordings. It transcribes while recording, and once recording ends, the audio and text are synced to both the PC and mobile apps.
Cohere is a platform that provides large language models to help developers and enterprises build AI products. It mainly offers AI-powered text search (multilingual embeddings, neural search, search ranking), text classification, and text generation, which enterprises can use to deploy conversational chatbots, generative search engines, text summarization, and enhanced vector retrieval.
Yun Yiduo is a conversational intelligent cloud storage assistant newly launched by Baidu Netdisk, intended to make cloud storage more convenient and improve users' efficiency in life, study, and work.
ElevenLabs is an AI text-to-speech platform that provides realistic voice synthesis solutions for developers, creators, and enterprises. Its core products include text-to-speech (supporting 29+ languages including Chinese and 10,000+ voices), AI dubbing, voice cloning, music generation, and more.
Deepgram is a platform that provides advanced AI speech recognition and natural language processing technology. Its core products are powerful Speech-to-Text (STT) and Text-to-Speech (TTS) APIs, enabling developers to quickly integrate voice transcription and understanding capabilities into their own applications and services.
Spikes Studio is an AI-powered video auto-editing tool that can automatically analyze and summarize long videos, extract key clips, and generate multiple short videos, aiming to simplify the editing workflow for video content creators. It is especially suitable for fast-paced social media platforms.
Tingnao AI is an AI-powered voice assistant focused on speech-to-text and real-time recording summaries, offering audio/video transcription, real-time recording-to-text, AI summaries, chapter overviews, and other features. Users can drag through the text to locate the matching position in the audio or video.
AiPy is a free, open-source AI agent factory, described as a local version of Manus, built on large language models (LLMs) and Python. It supports local deployment to keep data private and secure. Through the "Python-Use" paradigm, AiPy gives the AI "hands," enabling it to analyze local data, operate local applications, and execute complex tasks such as controlling phones, generating multi-voice speech, analyzing medical test reports, extracting speech from videos, and sending scheduled emails.
AnyGen is an AI office agent launched by ByteDance that improves office efficiency through voice input and AI technology. Users can press and hold the recording button to quickly convert speech into text, with support for adding photos, screenshots, and links, avoiding the tedious organization required after traditional note-taking.
iFLYTEK Simultaneous Interpretation is a professional AI simultaneous interpretation product from iFLYTEK. Built on the company's intelligent speech and language technologies, it provides integrated interpretation services across multiple scenarios and languages, including real-time transcription and translation, simultaneous interpretation, live subtitle display, and meeting record sharing.
