
Generative AI Videos – How to make them work

Generative AI is amazing, but it’s not perfect…yet.  When producing our first AI-generated commercial, we realized that old-school human effort is still required.  To get the four clips we ended up using, we had to generate ten.  And even those four weren’t perfect, so we had to get creative in adjusting them to make everything work.

Overall, it was a fun process, but it involved more trial and error than we expected, which ultimately becomes a budgeting issue.

Based on our experience with this first video, here are some tips for creating your own GenAI video.

It all begins with a good prompt

We initially tried generating or enhancing the prompts with Google Gemini, Adobe Firefly, and ChatGPT.  The results were mixed.  The prompts were beautiful and detailed, but they didn’t quite capture what we were looking for.  The AI-written prompts often added extraneous directions that confused the machine.

The prompt needs to set the scene, be clear about the characters, and focus on key actions rather than minute movements.  We also had to compromise on a few things after too many incorrect generations.
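If it helps to see that structure laid out, here’s a quick sketch in Python of how the pieces fit together.  It’s purely illustrative (we wrote our prompts by hand, not with code), but it shows the order we settled on: scene, then character, then one key action and the line.

def build_prompt(scene: str, character: str, action: str, line: str) -> str:
    """Assemble a video prompt: setting first, then the character,
    then the spoken line and one key action. One clear action per
    clip; skip the minute movements."""
    return f'{scene} {character} They look at the camera and say, "{line}". {action}'

# Example: a rough reconstruction of our Scene 1 prompt.
print(build_prompt(
    scene="A large construction project fills the background.",
    character="A Hispanic man wearing an orange reflective vest and a white hard hat holds a chocolate donut.",
    action="He takes a bite of the donut.",
    line="I'm here for the donuts",
))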

To demonstrate, here’s a reel with most of the generations from our Primo’s Donuts spot.

And here are the prompts for each of the scenes with our notes on how we adjusted:

Scene 1:

Show a Hispanic man wearing an orange reflective vest and a white hard hat.  Behind him is a large construction project.  He holds a chocolate donut, looks straight at the camera, and in a Hispanic accent, says, “I’m here for the donuts”.  He takes a bite of the donut.

Scene 2:

Medium shot of an elderly Asian woman selling fruit at a busy farmer’s market.  She is holding a donut in one hand and a bag of fruit in the other hand.  She hands the bag of fruit to a customer, then turns to the camera and in a thick Asian accent, says, “I’m here for the donuts”.

Scene 3:

Medium shot of a Caucasian woman in a business suit stands at an intersection in a busy city street.  She holds a donut in one hand and a coffee in the other hand.  She turns to the camera, holds up the donut, and in an Eastern-European accent, says, “I’m here for the donuts”.  The signal light turns green and she crosses the street.

Scene 4:

Show a black man at a beach resort singing a love song in front of a crowd of people.  He’s holding a donut.  He stops singing, looks at the camera, smiles and says in an African accent, “I’m here for the donuts”.  He resumes singing the love song.

We used Google VEO 3 for these generations.  First of all, it did not handle accents very well.  For Scenes 1, 3 and 4, it did not generate an accent at all.  It did fairly well on Scene 2, but the woman was looking away from the camera, and when we re-generated, we got a clip with no sound at all.  Ultimately, we decided not to use accents, and in retrospect, it was a good decision content-wise as well.

For Scene 4, the performer was supposed to stop singing and say his line addressing the camera.  In all of our generations, however, it had him singing the line, which we actually thought was pretty cool, so we kept it.

Some clips also added captions even though we hadn’t prompted for them.
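One more note: we ran all of these generations through the Flow web UI, but Veo is also available through Google’s Gen AI Python SDK (google-genai) if you’d rather script your retries.  Here’s a minimal sketch of that route; treat the model name, config, and polling interval as placeholders, since all of them shift as Google updates the service.

import time
from google import genai
from google.genai import types

client = genai.Client()  # expects a Gemini API key in the environment

# Kick off a generation; Veo runs as a long-running operation.
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",  # placeholder; use whatever model is current
    prompt='A Hispanic man in an orange reflective vest holds a chocolate donut and says, "I\'m here for the donuts".',
    config=types.GenerateVideosConfig(aspect_ratio="16:9"),
)

# Poll until the clip is ready.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Download each generated clip.
for n, generated in enumerate(operation.response.generated_videos):
    client.files.download(file=generated.video)
    generated.video.save(f"scene_{n}.mp4")

Scripting doesn’t dodge the trial-and-error cost; it just makes re-rolling a prompt less tedious.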

Creative editing is needed for GenAI videos

As the reel shows, there were some strange, even slightly eerie, actions from our non-human humans: guys chewing before taking a bite, the merchant dropping her bag of fruit, and several of our characters with creepy smiles and stares.

We definitely could have kept generating footage, but with a limited budget, we had to make it work with what we had.

You can edit Google VEO 3 footage directly in Flow, but we used Adobe Premiere instead.

Google VEO 3 vs Adobe Firefly

We get 4,000 AI credits every month with our Adobe Creative Cloud subscription, so we initially tried to create the videos with Adobe Firefly.  At this time, Firefly does not support speech and dialogue, so our first concepts for the donut spot had no “synchronized” sound.  That would have been OK had the Firefly generations been acceptable, but here’s the best we could do with Firefly.

As you can see, a total bust.  That’s when we turned to Google VEO 3, and it was night and day.  We still use Adobe Firefly for some scenes (for example, the title screen with the floating donuts was Firefly-generated), but VEO 3 is definitely the go-to tool for our AI videos.

Generative AI technology is developing at a rapid pace, so all the tips above could be moot in just a few months.  But as of now, we’re learning and trying to keep up with the changing world of video.

Check out our Picturelab AI page for more details on our Generative AI video services.
