This past week a video surfaced that’s causing quite the flutter. The video purports to show an Asian woman walking down a Tokyo street and is said to be entirely A.I. generated from text prompts. It’s impressive, and certainly generations beyond the A.I. generated still images with too many fingers. I was impressed with the skin texture of the model’s face and noted that they cleverly worked around possible uncanny valley issues by shielding the model’s eyes behind sunglasses. The only problem I see in the video is that when she’s walking her feet don’t always make “convincing contact” with the street, and there seem to be a few times that her walk cycle is off just a bit, being more influenced by the camera movement than by what would be “natural.”

Tokyo Walk, created by OpenAI’s Sora

That this 59-second scene was created with text prompts is astonishing. Does this spell the end of graphics professionals who previously were employed crafting such street scenes for videos? Well, the industry is going to change. But that’s the nature of the business, going back to when some movie makers moved from New York to California over a century ago for the better weather and less corporate interference. Like LLMs and other graphics generators, this is another tool that eases the effort from idea to product. The catch is the same as with all of the video shorts I’ve seen proposing, “What would it be like if Wes Anderson directed ‘Name Your Movie Franchise HERE’?” If the person generating the idea doesn’t know why a Wes Anderson movie works, then the resulting product will be derivative crap.

Thinking about the transition from analog to digital, the designers of the original Photoshop knew enough to approach image editing from the point of view of traditional photography and give users the kinds of tools, or analogs of the tools, that they were used to working with when editing photos in an analog darkroom. The genius of Phil Tippett was designing animation controls that were based on what he was doing when he was producing “analog” stop-motion films. This is another generation of tools. In the hands of hacks, it’s going to be more meaningless crap; in the hands of a creative mind, astonishing beauty. But as with desktop publishing and social media before it, we’ll have to wade through lots and lots of crap to find the handful of stories worth our time and attention. Turns out that infinite monkeys on infinite keyboards do not result in Shakespeare in the least. There needs to be a “there” there in the beginning for any of this to matter.

blogging - we're going to need more monkeys

Sources: