Source: Machine Heart
On Day 3 of OpenAI's 12 days of announcements, the blockbuster release is finally here!
Just as everyone in the comments predicted before the livestream, the official version of the video generation model Sora has arrived!
It has been nearly 10 months since Sora was first unveiled on February 16 this year.
Now users can finally experience Sora's powerful video generation capabilities for themselves!
OpenAI has also built a new version of Sora, Sora Turbo, which is significantly faster than the February preview model. Starting today, this version is available as a standalone product to ChatGPT Plus and Pro users.
According to today's livestream, Sora users can generate videos at resolutions up to 1080p and lengths up to 20 seconds, in widescreen, vertical, or square formats. Users can bring in their own assets to extend, remix, and blend videos, or generate entirely new content from text. OpenAI has built a new interface that makes it easier to prompt Sora with text, images, and videos, and a storyboard tool lets users precisely specify the input for each frame.
We can start by looking at a few examples of generated videos:
Prompt: The footage is foggy and high-contrast, capturing the feel of a low-visibility shot and conveying a sense of immediacy and chaos. The scene shows shaky footage from the perspective of a sailor on a 17th-century pirate ship. The horizon sways violently as waves crash against the wooden hull, making details hard to discern. Suddenly, a giant sea monster rises from the raging sea. Its huge, slick tentacles reach out menacingly, wrapping around the ship with terrifying force. The view shifts dramatically as the sailors scramble to confront the terrifying creature. The atmosphere is extremely tense, with the groaning of the ship and the roar of the sea audible amid the chaos.
Prompt: Rockefeller Center is full of golden retrievers! Everywhere you look, there are golden retrievers. It's a winter wonderland in New York at night, complete with a giant Christmas tree. Taxis and other New York elements can be seen in the background.
Sam Altman said that what excites him most is how easy Sora makes it to co-create with others; it feels like a fun new thing. You can think of Sora as the GPT-1 of video.
OpenAI research scientist Noam Brown said that Sora is the most intuitive demonstration of the power of scale.
Reacting to the release, some netizens called it the best Christmas gift, while others said Sora will be a game changer.
Bring your imagination to life with text, images, or videos
Excited, Synced also wanted to try Sora out. Unfortunately, so many people are trying it that we couldn't log in:
Try it at: https://sora.com/onboarding
Let’s first show readers the capabilities of Sora as officially released.
Use Remix to replace, remove, or reimagine elements in your video
Open the door to the library
Replace the door with a French door
The scene outside the door is replaced with a lunar landscape
Re-cut: Find and isolate the best frames, then extend them in either direction to complete the scene
Storyboard: Organize and edit unique sequences of videos on a timeline
The scene for the first 114 frames of the video is "a vast red landscape with a spaceship docked in the distance."
Next, the scene for frames 114-324 of the video can be transformed to: "Looking out from the inside of the spaceship, a space cowboy stands in the center of the frame."
Finally, the video content can be described as "a close-up of the astronaut's eyes, framed by a mask made of knitted fabric."
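To make the storyboard idea concrete, here is a purely illustrative Python sketch of how such a timeline could be represented as frame-indexed prompts, mirroring the example above. This is not Sora's actual file format or API; the field names are hypothetical, and the final shot's end frame is left unspecified because the example does not give one.

```python
# Hypothetical representation of a Sora storyboard: a list of shots, each
# pinning a prompt to a span of frames on the timeline. Field names are
# illustrative assumptions, not Sora's real schema.
storyboard = [
    {"start_frame": 0, "end_frame": 114,
     "prompt": "a vast red landscape with a spaceship docked in the distance"},
    {"start_frame": 114, "end_frame": 324,
     "prompt": "looking out from inside the spaceship, a space cowboy "
               "stands in the center of the frame"},
    {"start_frame": 324, "end_frame": None,  # end frame not given in the example
     "prompt": "a close-up of the astronaut's eyes, framed by a mask "
               "made of knitted fabric"},
]

for shot in storyboard:
    end = shot["end_frame"] if shot["end_frame"] is not None else "end"
    print(f"frames {shot['start_frame']}-{end}: {shot['prompt']}")
```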
Loop: Use Loop to edit and create seamlessly repeating videos
Blend: Merge two videos into a seamless clip
Style presets: Use "Presets" to create and share styles that inspire your imagination
More amazing Sora-generated videos await the imagination of netizens to create them.
Sora official version system card
In February of this year, when Sora was first released, OpenAI published a technical report on Sora.
OpenAI believes that scaling video generation models is a promising path toward building general-purpose simulators of the physical world.
Today, with the official release of Sora, OpenAI also released the Sora System Card. Interested developers can dig into the technical details.
Address: https://openai.com/index/sora-system-card/
Sora is OpenAI's video generation model, designed to take text, image, and video inputs and generate new videos as outputs. Users can create videos up to 1080p resolution (up to 20 seconds) in a variety of formats.
Sora builds on the DALL·E and GPT models and aims to give people tools for creative expression.
Sora is a diffusion model: it generates a video by starting from one that looks like static noise and gradually transforming it by removing the noise over many steps. By giving the model foresight over many frames at once, Sora solves the challenging problem of keeping a subject consistent even when it temporarily leaves the frame. Like the GPT models, Sora uses a transformer architecture, which unlocks excellent scaling properties.
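To make "removing the noise over many steps" concrete, here is a minimal Python/PyTorch sketch of a reverse-diffusion loop over a video-shaped latent. The tensor shapes, the linear noise schedule, the Euler-style update, and the zero-output `denoiser` stand-in are all illustrative assumptions; Sora's actual model and sampler are not public.

```python
import torch

# Hypothetical latent video shape: batch, frames, channels, height, width.
B, T, C, H, W = 1, 16, 4, 32, 32
num_steps = 50

def denoiser(x_t, sigma):
    # Stand-in for the learned transformer denoiser (assumption).
    # A real model would predict the noise present in x_t at level sigma.
    return torch.zeros_like(x_t)

# Start from pure Gaussian noise -- the "video that looks like static".
x = torch.randn(B, T, C, H, W)

# A simple linear schedule of noise levels, largest first (assumption).
sigmas = torch.linspace(1.0, 0.0, num_steps + 1)

for i in range(num_steps):
    # Predict the noise component and remove a fraction of it,
    # stepping from noise level sigmas[i] down to sigmas[i + 1].
    predicted_noise = denoiser(x, sigmas[i])
    x = x - (sigmas[i] - sigmas[i + 1]) * predicted_noise

# x now approximates a clean video latent after all denoising steps.
```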
Sora uses the recaptioning technique from DALL·E 3, which generates highly descriptive captions for the visual training data. As a result, Sora follows the user's text instructions more faithfully in the generated video.
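As a sketch of what recaptioning means in a training pipeline (an illustration, not OpenAI's actual code): a captioning model writes a detailed caption for every training video, and those generated captions are paired with the videos for training. The `caption_model` below is a hypothetical stand-in.

```python
# Illustrative sketch of recaptioning. caption_model is a hypothetical
# stand-in for a learned video captioner, not a real OpenAI API.
def caption_model(video):
    return "a highly descriptive caption for " + video

training_videos = ["video_001.mp4", "video_002.mp4"]

# Replace sparse or missing labels with detailed generated captions,
# yielding (caption, video) pairs for training.
training_pairs = [(caption_model(v), v) for v in training_videos]
print(training_pairs)
```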
In addition to generating videos from text instructions alone, the model can take an existing still image and generate a video from it, animating the image's content with accurate attention to small details. It can also take an existing video and extend it or fill in missing frames. Sora is a foundation for models that can understand and simulate the real world, a capability OpenAI believes will be an important milestone on the road to AGI.
On the data side, as OpenAI described in its February technical report, Sora takes inspiration from large language models, which acquire generalist capabilities by training on internet-scale data. LLMs established a new paradigm in part through their use of tokens, which elegantly unify the many modalities of text: code, math, and various natural languages.
With Sora, OpenAI considered how a generative model of visual data could inherit these benefits. Where large language models have text tokens, Sora has visual patches. Prior research has shown that patches are an effective representation for models of visual data, and OpenAI finds them to be a scalable and effective representation for training generative models on diverse types of videos and images.
At a high level, OpenAI converts videos to patches by first compressing them into a lower-dimensional latent space and then decomposing the representation into spatiotemporal patches.
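As a rough illustration of what spatiotemporal patches mean in practice, the sketch below slices an already-encoded video latent into non-overlapping space-time blocks and flattens each block into a token-like vector. The latent shape and patch sizes are assumptions chosen for illustration, not Sora's published configuration.

```python
import torch

# Hypothetical latent video: (frames, channels, height, width),
# i.e. the output of a video compression encoder.
T, C, H, W = 16, 4, 32, 32
latent = torch.randn(T, C, H, W)

# Assumed patch size: 2 frames x 4x4 latent pixels per patch.
pt, ph, pw = 2, 4, 4

# Split each axis into non-overlapping blocks, group the block indices
# together, then flatten each block into a single vector -- analogous
# to text tokens in an LLM.
patches = (
    latent.reshape(T // pt, pt, C, H // ph, ph, W // pw, pw)
    .permute(0, 3, 5, 1, 2, 4, 6)   # (time, row, col) block index first
    .reshape(-1, pt * C * ph * pw)  # one flat vector per patch
)

print(patches.shape)  # (num_patches, patch_dim) = (512, 128)
```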
Sora has been trained on a variety of datasets, including public data, proprietary data obtained through partners, and custom datasets developed in-house:
Publicly available data. This data is mainly collected from industry-standard machine learning datasets and web crawls.
Proprietary data from data partners. OpenAI forms partnerships to access non-public data. For example, it partners with Shutterstock and Pond5 to build and deliver AI-generated images, and it also commissions the creation of datasets tailored to its needs.
Human data. Feedback from AI trainers, red teamers, and employees.
For more details, readers can check out the system card introduction.
Pricing and Benefits
Of course, alongside the official release of Sora, OpenAI also announced pricing, and it is not cheap:
The video generation benefits for ChatGPT Plus users ($20 per month) include:
Up to 50 priority videos (1,000 credits)
Resolution up to 720p, duration up to 5 seconds
The video generation benefits for ChatGPT Pro users ($200 per month) include:
Up to 500 priority videos (10,000 credits)
Unlimited relaxed videos
Resolution up to 1080p, duration up to 20 seconds, and up to 5 concurrent generations
Watermark-free downloads
After all this anticipation, are you going to give it a try?