Found 37 results for “Synthesis”
Translate.video is an AI translation tool for video content, supporting video translation, subtitle translation, dubbing, AI voice conversion, recording, and text generation to help distribute video content in multiple languages.
AI Voice Detector is an audio authenticity detection tool used to identify whether speech is generated by AI. Users can upload audio files for verification, making it suitable for scenarios involving evidence review, media judgment, and authenticity analysis in customer communications.
Illuminarty is a detection tool for identifying AI-generated or tampered content. It can analyze images, text, and suspected synthetic images, helping users determine whether content involves AI generation, editing, or Deepfake.
Synthesis YouTube is a knowledge retrieval tool for video podcast content. It lets users search by keyword across more than 1,000 hours of content and quickly locate relevant clips, helping them find topic-related information more efficiently.
AITWO.CO is an AI architecture and spatial design tool that supports generating concepts for multiple building types and allows custom visual parameters such as style, color, lighting, composition, and details.
HeyGen is an online AI video generation tool that supports creating talking avatar videos and provides customizable avatars and voiceover features, suitable for content production scenarios such as training, teaching, explanations, and marketing.
Nyx is a project showcasing AI-generated virtual images, covering categories such as food, animals, and landscapes. The image style is realistic, but the depicted objects do not actually exist.
Novels AI is a tool for generating personalized audio adventure stories, allowing users to customize characters and plot choices and experience AI-driven immersive story content in audiobook form.
NarrationBox is an AI voice generation tool that offers more than 700 AI narrator voices for creating audio content such as podcasts, audiobooks, and dubbing.
Revocalize AI is an AI voice synthesis tool that supports voice cloning, voice protection, and voice creation, offers multilingual voice options, and is suitable for audio content production and personalized voice applications.
Voiceful provides game character voice generation and speech synthesis demos, and supports integration into Unity via SDK, making it suitable for development and testing scenarios that require character voice capabilities.
Harmonai is a community-driven open-source generative audio project dedicated to providing AI tools for music creation, enabling more people to participate in sound and music generation practice.
Free Text to Speech Generator is an online TTS tool that supports multiple languages, multiple dialects, and mixed Chinese-English reading, and can convert text into speech and export MP3 files.
LALAL.AI is an audio separation tool that can extract vocals or multiple instrument tracks from songs, supports high-quality audio processing, and is suitable for music editing, practice, and asset creation.
JURA.Bio is an AI project focused on healthcare and cell therapy. It is positioned to explore future applications of cell therapy, but public information is currently limited and its functional details remain unclear.
Pollinations is an AI image generation tool that generates images from text prompts based on the Stable Diffusion model, and supports setting parameters such as frame count, random seed, and diffusion steps, making it suitable for creating artistic images and experimental visual content.
Google AI text-to-image generation model
IBM Watson Text to Speech
Xunfei Zhizuo is a one-stop AIGC content creation platform launched by iFLYTEK, providing services such as text-to-speech and virtual digital human video production based on artificial intelligence technology. Users can easily achieve rapid generation of audio and video content and create high-quality media works without professional skills.
Uberduck is an open-source community for AI voice generation and synthesis. The platform offers more than 5,000 voices for creating AI dubbing and speech, and users can also synthesize speech with their own custom voice clones.
Moyin Workshop is a professional AI voiceover tool with more than 800 voices and over 1,000 styles, covering needs from video dubbing to audiobooks. Its features include speech rate adjustment, polyphonic character selection, and pause control, which help produce natural-sounding text-to-speech results. Users can also download lossless audio files.
Qimiaoyuan is an AI digital human short-video and livestreaming solution launched by Mobvoi. Users can create their own digital avatars and livestream through them. The platform currently offers more than 100 digital humans and more than 1,000 3D digital assets.
Xiangji Translation is an AI image and video translation tool launched by Xiangji Technology. It combines text recognition, text translation, image/video restoration, and text rendering to provide efficient and accurate image and video translation. Xiangji Translation preserves the quality of the original images and videos as far as possible while translating their text into the user's chosen language.
ElevenLabs is an AI text-to-speech platform that provides realistic voice synthesis solutions for developers, creators, and enterprises. Its core products include text-to-speech (29+ languages, including Chinese, and 10,000+ voices), AI dubbing, voice cloning, music generation, and more.
Zidong Taichu is a multimodal large model jointly launched by the Institute of Automation, Chinese Academy of Sciences, and the Wuhan Institute of Artificial Intelligence. The current 2.0 version is an upgrade of the 100-billion-parameter multimodal large model “Zidong Taichu 1.0.” It supports tasks such as multi-turn Q&A, text creation, image generation, 3D understanding, and signal analysis, and offers strong cognitive, comprehension, and creative capabilities.
Deepgram is a platform that provides advanced AI speech recognition and natural language processing technology. Its core products are powerful Speech-to-Text (STT) and Text-to-Speech (TTS) APIs, enabling developers to quickly integrate voice transcription and understanding capabilities into their own applications and services.
Chanjing is an AI digital human short-video and livestreaming platform launched by Chanmama, a marketing data analysis platform. Through ultra-fast cloning technology and an efficient content production workflow, it enables users to quickly create and publish digital human short videos.
Xiling Digital Human is a digital human platform launched by Baidu AI Cloud based on artificial intelligence technology, providing enterprises and individual developers with high-performance, easy-to-integrate, and diverse digital human component capabilities. The platform supports digital human avatar customization, video synthesis, interactive dialogue, livestreaming, and other multi-scenario applications to meet the needs of different industries.
Xunfei Huijing (formerly Xinghuo Huijing) is an AI short-video creation platform launched by iFLYTEK that can automatically convert user-entered text descriptions into video content (such as short dramas, trailers, and MVs), including generating video scripts and storyboards, ultimately forming complete short videos.
Bairimeng AI is an AI video creation platform launched by Guangmo Technology. Through natural language processing technology, it allows users to input text content and quickly generate videos, with a maximum length of up to 6 minutes. The platform supports text-to-video, dynamic visuals, AI character generation, and other features, while maintaining consistency in characters and scenes.
LangLang Voiceover is an intelligent text-to-speech tool that provides voice synthesis services. It supports more than 30 languages, including Chinese, English, German, and French, as well as more than 10 emotional styles such as happy, sad, and excited. The platform is feature-rich and easy to use, supporting SSML tags to enable advanced functions such as polyphonic character handling and multi-speaker dubbing.
Abei Intelligence is a one-stop AI picture book creation platform designed for children's education. Through three simple steps—story creation, image generation, and intelligent voice-over—users can quickly create personalized picture books. Abei Intelligence encourages parent-child interaction, cultivates children's creativity, emotional expression, and language skills, while incorporating science, moral education, and physical activities to spark children's interest in technology and help them thrive in the intelligent era.
SoundView is an AI video localization tool that supports video dubbing and video translation. It integrates multilingual translation, speech synthesis, speech recognition, and large-model technology to simplify and accelerate the creation of product marketing videos. It supports dubbing and subtitle editing in 100 languages, increasing video production efficiency by 10 times and reducing video translation costs by 90%.
SiliconFlow is a generative AI computing infrastructure platform. Its products include the SiliconLLM large model inference engine, the OneDiff high-performance text-to-image/video acceleration library, and the SiliconCloud model cloud service platform, which reduce AI model deployment and inference costs while improving user experience.
JoyPix is an AI creation tool focused on digital humans and speech synthesis. Users can create personalized virtual avatars by uploading photos, with support for voice conversations with virtual avatars.
Wujie Future is an AI application and elastic computing network platform focused on providing users with strong computing power support and a wide range of AI application services. Wujie Future offers multiple types of GPU resources, allowing users to choose suitable resources based on their needs for AI application training and deployment.
Xmov Nebula is an embodied intelligent 3D digital human open platform launched by Xmov Technology, dedicated to upgrading AI from “having a brain” to “having a body” to enable natural expression and interaction. Based on text input, Xmov Nebula can generate a 3D digital human’s voice, expressions, and movements in real time, supporting multimodal generation, low-cost operation, low-latency interaction, and multi-terminal adaptation.
