OpenAI has recently announced its latest innovation, Sora, a text-to-video model designed to understand and simulate the physical world in motion.
Sora is a milestone in AI development, capable of generating videos up to a minute long that adhere closely to the user’s prompt while maintaining high visual quality.
“We’re teaching AI to understand and simulate the physical world in motion,” OpenAI states, a goal that points to Sora’s potential to change how AI handles real-world interaction.
Sam Altman, CEO of OpenAI, has been actively showcasing Sora’s capabilities, inviting users to submit video captions to demonstrate the model’s ability to generate complex scenes with accurate details.
Sora employs a diffusion model approach, starting with static-like noise and progressively refining it into a coherent video.
This method allows for the creation of entire videos at once or the extension of existing ones, ensuring consistent subjects even when they temporarily leave the view.
The model’s architecture, inspired by the transformers used in GPT models, allows it to scale and to handle varied visual data, including different durations, resolutions, and aspect ratios.
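For readers who want a feel for the "start from noise, refine step by step" idea, here is a minimal Python sketch. It is a toy illustration only and not Sora's method: the names `make_target`, `toy_denoiser`, and `refine` are hypothetical, the "video" is just a few rows of numbers, and the denoiser cheats by already knowing the clean result, where a real diffusion model would use a trained neural network to estimate it.

```python
import random


def make_target(num_frames, frame_size):
    """Build a tiny stand-in 'video': a brightness ramp that shifts each frame."""
    return [
        [((i + t) % frame_size) / frame_size for i in range(frame_size)]
        for t in range(num_frames)
    ]


def make_noise(num_frames, frame_size, rng):
    """Start from static-like Gaussian noise with the same shape as the video."""
    return [
        [rng.gauss(0.0, 1.0) for _ in range(frame_size)]
        for _ in range(num_frames)
    ]


def toy_denoiser(video, target):
    """Hypothetical stand-in for a trained network.

    A real model would estimate the clean video from the noisy one.
    This toy version simply returns the known target (an assumption
    made purely so the example is self-contained).
    """
    return target


def mean_abs_error(a, b):
    """Average absolute difference between two videos of the same shape."""
    total = sum(abs(x - y) for fa, fb in zip(a, b) for x, y in zip(fa, fb))
    count = sum(len(frame) for frame in a)
    return total / count


def refine(noise, target, num_steps):
    """Move all frames a little closer to the clean estimate at every step."""
    video = [frame[:] for frame in noise]
    history = []
    for step in range(num_steps):
        remaining = num_steps - step
        estimate = toy_denoiser(video, target)
        # Close 1/remaining of the gap, so the last step lands on the estimate.
        video = [
            [x + (e - x) / remaining for x, e in zip(frame, est_frame)]
            for frame, est_frame in zip(video, estimate)
        ]
        history.append(mean_abs_error(video, target))
    return video, history
```

Calling `refine(make_noise(4, 6, random.Random(0)), make_target(4, 6), 10)` returns the refined frames and a list of errors that shrinks at every step. All frames are updated together rather than one after another, which loosely mirrors the article's point about creating an entire video at once.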
Sora’s Capabilities and Research
Sora has been built on the foundation laid by previous research in DALL·E and GPT models, incorporating the recaptioning technique from DALL·E 3 to generate descriptive captions for training data.
This allows Sora to produce videos that closely follow text instructions, animate still images, extend videos, and fill in missing frames with remarkable accuracy.
OpenAI’s commitment to advancing this model is evident in its invitation to visual artists, designers, and filmmakers to provide feedback, with the aim of refining Sora for creative professionals.
However, the model isn’t without its limitations. It struggles to simulate complex physics accurately and to understand specific instances of cause and effect: for example, a person might take a bite out of a cookie, yet the cookie may show no bite mark afterwards. OpenAI acknowledges these weaknesses and is actively working on improvements.
Prioritising Safety
OpenAI is taking serious safety measures before making Sora widely available. The model is undergoing adversarial testing by red teamers, domain experts in areas such as misinformation, hateful content, and bias, to identify potential harms or risks.
OpenAI is also developing tools to detect misleading content and plans to include C2PA metadata in future deployments to enhance safety.
The existing safety methods developed for DALL·E 3, such as text and image classifiers that review and reject inappropriate content, will be applied to Sora.
OpenAI is engaging with policymakers, educators, and artists worldwide to understand their concerns and identify positive use cases for the technology, underscoring the importance of learning from real-world use in creating safe AI systems.
Sora’s Future In AI
While Sora is currently available to a select group of red teamers and creative professionals for testing and feedback, there’s no definitive word on when it will be accessible to the broader public or what it will cost.