Expectations of using AI in Children's Picture Books
I've written before about the use of AI in producing a children's picture book - the pros, the cons, the acceptance, the revulsion - so I'm not going to touch on those issues again here. And this won't be a discussion of the philosophical or societal implications of using AI. Not in this post, anyway. This post is simply meant to temper any exuberant expectations you may have if you're thinking of creating your own AI-aided children's picture book. So, with that out of the way, let's go!
Consistently Designed Characters
No matter how good the AI is or how well you write your prompts, you probably won't get consistent-looking characters throughout. This is one of the main pain points for creators using AI to tell stories built around recurring characters. The tech right now is good, but most tools require a certain level of technical proficiency and AI know-how to get consistent-looking characters. Even then, you can often see variations in characters from page to page or panel to panel. So, employ "KISS."
Keep It Simple, Stupid
Keep It Simple, Stupid (KISS) is an old saying born in the engineering world and heavily used and promoted in the technology and programming world (the world I came from). And it's applicable to AI storytelling at this moment in time. Instead of using AI for a character-driven story, come up with an idea that'll work with the current state of AI-generated art.
Practice. Practice. Practice.
After coming up with a workable theme, the next step is to practice. Prompting is easy, yes, but it takes a little time to get the hang of it. Don't expect your first prompt to match that visual you've had stuck in your head for the last five years. You have to read up on each tool's models and learn how best to craft your prompts to get the results you want.
Some Results Come Quick. Some Don't. Some Not at All.
So now that you have a theme and have put some effort into learning and practicing your AI-generating skills, it's smooth sailing to a finished children's picture book, right? Well, probably not.
Having done three books now and spent time working on three others that didn't make the cut, I'm often surprised at what works and what doesn't. I've crafted many prompts I thought would generate perfectly on the first gen, only to abandon them after many unsuccessful subsequent gens. And then there are those prompts I thought wouldn't work but that ended up working so well that minimal or no modifications were needed. Let's look at some examples.
It's a Beaver's Tale
One of the animals I was going to include in "If Animals Had Jobs 3" was a beaver (who replaced another animal from my original set that didn't sit well with me after I finished the book). Since the series is aimed at a young audience, I went with a beaver for its well-known attributes and industrious habits. The prompt went something like:
"...a beaver working as a lumberjack surrounded by cut trees."
The original gen is the first image. Hmm...not bad. I modified this image by changing the background with additional generations. In the end, however, the image felt out of place with the rest of the book's illustrations. So, back to the drawing board and a few more generations with a modified prompt.
Nice images, and getting closer, but the images looked too much like they came right out of some nature magazine. So, I gave our industrious rodent some lumberjack clothes, changed up his style, and placed him in a few different backgrounds.
All fine, but still not what I wanted. I didn't think I was going to get the result I was after, so it was time for Mr. Beaver to retire from his lumberjacking ways. Perhaps we'll pull him out of retirement at some point in the future.
Jumping for Joy
Here's one image that took a single prompt, no additional variations, and no editing. We have a Goliath frog performing his duties as a singer in a heavy metal band. All hail heavy metal! From book 3:
"A Goliath frog could be a heavy metal singer."
Writing is a Lonely Job. Especially for a Dragon.
Some images require multiple generations from several different prompts, variations of those prompts, and additional post-gen editing. Here's one such example. From book 3:
"A Komodo dragon could be a fantasy writer."
I liked the last variation and went with that. Some of the lizards didn't look like Komodos, and you can see where the AI was going with the second image, which is interesting.
Now I needed a way to convey that the Komodo dragon was a fantasy writer. Initially, the thought was to show him, well, deep in thought about a typical-looking dragon. A few prompts later, I had a dragon that fit the bill. Now, the background of the Komodo dragon had to be removed, a thought balloon generated and added, the fantasy dragon re-generated and extended to fit the balloon, and then everything mashed together to get the final result.
Hmm...the concept and look of Mr. Komodo worked for me, but not the end result. Back to the generating board! This time, instead of having our scribe think about a dragon, I decided to adorn his writing cave with pictures of dragons. First, I needed to generate images of dragons in picture frames.
These weren't bad, but even when extending the generation, the frames weren't quite right. And I didn't like the looks of the dragons. So, I generated several empty picture frames and additional dragons to go with them.
Putting it all together, we get the finished page that's in the book.
Note: It took many more generations and images to land on the final result than I'm showing here.
AI is here to stay. I find it incredibly useful, not only for supplementing certain stories, but in my everyday workflow. Give it a shot! But temper those expectations!