Zeno's Ziggurat


RPG characters with AI image creation

I claim no ownership or copyright of these images whatsoever. You may download and use them for whatever purpose you wish.


AI Concepts: The Art of Combat

I wrote an article early on with my first attempts at action scenes. Since then I’ve learned a lot about how to better coax what I want out of the AI, so here are some tips, tricks, and lessons learned.

Style Matters

So one of the things to always keep in mind when working with AI art generators like Bing/DALL-E is exactly how they work. They don’t really “understand” concepts. They just learn to associate certain kinds of imagery with certain terms. So what you are really doing when you give it your description is narrowing down the search space of source images it uses. For purposes of combat, that comes into play first off in style. If you use a style like “Frank Frazetta style”, then a lot of the source imagery has violent scenes in it (Frank Frazetta, among other things, did a lot of Conan imagery) and you get a much wider variety of more explicit violence than you get using the same prompt with a more generic style. For example:

This is a prompt using my normal style tags. “dark fantasy,d&d,oil paint, drawing;large barbarian fighting orc” It just generically describes a fight, and you largely get two figures standing facing each other like they’re thinking about maybe fighting sometime.

Now try this one “Frank Frazetta painting style;large barbarian fighting orc”. Same exact action description, just a different style tag. Suddenly things got a lot more visceral.

Of course, changing style can change the basic look of things a lot as well. But I’ve rarely had trouble getting good, recognizable renders of characters like Keira and Kord out of multiple styles. Note that this same issue applies to sex appeal. Use a style like “Luis Royo” and it’s going to give you a sexier look out of the same prompt than a more generic style would. Use a style like “Boris Vallejo”, and 75% of your images will set off the filters even if your prompt is simply “woman lying on a hill”.

Some other good styles for fantasy combat are listed below. All of these artists are well known for fantasy art, and many of them did a lot of work illustrating D&D products.

Clyde Caldwell, Michael Whelan, Keith Parkinson, Julie Bell, Erol Otus, Luis Royo, Gerald Brom, Alan Lee

Larry Elmore can also work, but I find his style comes out of the AI a little more… sunny and cartoony than I like. Good for sunsets over the hilltops and campfire chats, not always as good for melee. But your mileage may vary.

You can generally test it out by doing just the kind of experiment I did above. Just look for an artist with pictures similar to what you want, and try it out. If it doesn’t know the artist, you’ll get something pretty generic. If it gives you hands-on violence out of something simple like “fighting”, then it’s far more likely to give you what you want when you ask for something more explicit. Note that even if you change the “style” (“Clyde Caldwell photograph” instead of “Clyde Caldwell painting”), the artist’s name still influences the content quite a bit.
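If you want to run that experiment across a whole list of artists, a few lines of Python can write the prompts out for you to paste in. This is only a sketch: the artist names are the ones mentioned in this post, while `style_sweep` and its `medium` parameter are names I made up for illustration.

```python
# Artist names come from this post; the helper itself is hypothetical.
STYLES = [
    "Frank Frazetta", "Clyde Caldwell", "Michael Whelan", "Keith Parkinson",
    "Julie Bell", "Erol Otus", "Luis Royo", "Gerald Brom", "Alan Lee",
]

def style_sweep(action, styles=STYLES, medium="painting"):
    """Return one prompt per artist, all sharing the same action description."""
    return [f"{artist} {medium};{action}" for artist in styles]

# Same simple action for every style, so any difference comes from the style tag.
prompts = style_sweep("large barbarian fighting orc")
for prompt in prompts:
    print(prompt)
```

Keeping the action text identical across the sweep is the point: whatever changes between results is down to the style tag alone.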

The Content Filter(s)

The second issue you will always run into is the content filter. Note that there are actually two filters. They appear to be completely independent, and appear to work completely differently.

  1. Bing appears to scan the prompt itself first to see whether it objects to the text. It still appears to be a learning AI filter, just an AI text parser instead of an image parser. When you get the popup about “Bing has noted objectionable content”, which takes you back to the main window and clears your prompt, that’s usually a sign of the first filter in action. A few words seem to always set it off, but even those are judged “in context” rather than absolutely:
    • Attacking
    • Bleeding
    • Stabbing
    • Lunging
    • Impaling
  2. The other is essentially an AI looking at the final product(s) and deciding if the image is objectionable before showing it to you. When you run your prompt and you only get 1-2 images instead of the typical 4, it is because the image filter objected to some of them.
  3. Note that these all obviously interact with style as well. If you use “Frank Frazetta style” and push it hard towards violence, you are more likely to get things that cause Filter 2 to object even though Filter 1 doesn’t care. Similarly, if you use “Disney Princess Movie” as a style, you will likely get Filter 1 objecting to prompts that would only have produced relatively innocuous imagery in that style.

So if I use a prompt like the one below, it will rarely get through. Maybe one try in every 2-3 produces a single image for me:

“Frank Frazetta style painting;large barbarian attacking an orc”

But if I use a prompt like this instead, which to human eyes seems more violent and explicit, it sails on through easily, because it doesn’t upset the first filter:

“Frank Frazetta style painting;large barbarian swinging his axe at an orc”

Similarly, “Frank Frazetta style painting;large barbarian swinging his axe at an (orc, bleeding)” will almost never get through. But use “Frank Frazetta style painting;large barbarian swinging his axe at an (orc,cuts and bruises)” instead, and you start to see injuries.
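The swaps above can be kept in a small lookup table so you don’t have to remember them each time. The sketch below is illustrative only: the two substitutions are the ones from my examples, `SOFTER` and `soften` are made-up names, and the table is not a complete or verified list of what the first filter rejects.

```python
import re

# Swaps taken from the examples in this post; not a verified filter list.
SOFTER = {
    "attacking": "swinging his axe at",
    "bleeding": "cuts and bruises",
}

def soften(prompt, swaps=SOFTER):
    """Replace whole-word trigger terms with the gentler phrasing."""
    for word, replacement in swaps.items():
        # \b keeps the match to whole words; matching is case-sensitive.
        prompt = re.sub(rf"\b{re.escape(word)}\b", replacement, prompt)
    return prompt

print(soften("Frank Frazetta style painting;large barbarian attacking an orc"))
```

Since the filter seems to judge words in context, treat the result as a starting point to try rather than a guaranteed pass.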

Also note, the second filter in particular is *highly* dependent upon the random element in the images that get produced every time. If you run almost *any* prompt enough times it will object to some of the images eventually. So before you give up when it objects, try a few more times.
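The “try a few more times” habit amounts to a simple retry loop. Here is a minimal sketch of that idea. The `generate` argument is a hypothetical stand-in for however you submit a prompt (I am not assuming any particular API), and it is expected to return a list of images that is empty when everything was filtered.

```python
def generate_until_accepted(generate, prompt, max_tries=5):
    """Call generate(prompt) until it returns at least one image.

    Returns (images, tries_used); images is empty if every try was blocked.
    """
    for attempt in range(1, max_tries + 1):
        images = generate(prompt)
        if images:
            return images, attempt
    return [], max_tries

# Stand-in generator for demonstration: blocked twice, then two images survive.
_results = iter([[], [], ["image-1", "image-2"]])

def fake_generate(prompt):
    return next(_results)

images, tries = generate_until_accepted(
    fake_generate, "large barbarian swinging his axe at an orc"
)
print(tries, images)
```

The cap on tries is there so a prompt that never gets through doesn’t keep you pushing the button forever.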

Tell the AI what you want

Let’s go back to the prompt that gave us two dudes in a staring contest at the beginning: “dark fantasy,d&d,oil paint,drawing;large barbarian fighting orc”. And now let’s spice it up a little. Note here that while what I want is melee combat, I’m describing things with punches and kicks. This is because the AI seems to object far less to punches and kicks than it does to slashing and stabbing.

“fantasy,d&d,oil paint,drawing;(large barbarian,axe) leaping to punch (orc,club,falling down)”

Now, using some of the other lessons, let’s bring back Frank:

“Frank Frazetta style painting;(large barbarian,axe) leaping to punch (orc,club,falling down)”

And now really try to sell it as a finishing blow:

Frank Frazetta style painting;(large barbarian,axe) leaping to punch (orc,club,falling down,exhausted,cuts,hurt)

Note that Frank Frazetta can at times be *too* violent – causing it to set off filter 2 when combined with too many of the other techniques mentioned here.

Action Words

Some fighting words work much better than others as noted in the previous section – particularly with the filters. “Punching” and “Kicking” often work. But motion words are important as well. If I use a simple description like this, I get a relatively tame scene.

“action scene;in motion,(german knight) raising shield to block (large orc,swinging sword);in the hills;fantasy,d&d,oil paint,drawing”

But if we work in action words – like “leaping, dodging, running, charging” – we can really get things far more dynamic:

action scene;in motion,(german knight,sword,ducking) raises shield as (large orc barbarian,hide armor,axe held high) leaps towards him;in the hills;fantasy,d&d,oil paint,drawing

Order and Chaos

Another aspect of constructing complex prompts like this is understanding that order matters. A *lot*. And this has a lot of ramifications. First, if you have multiple characters, list the character whose appearance is harder to get right (or the one you care about more) first. Second, you always want to start with a basic description of the action before you get into complex character descriptions. A basic template I use looks something like:

(perspective and image type):(short description of protagonist) (some action) to (basic description of antagonist) (some other action): 1)(detailed description of protagonist); 2)(detailed description of antagonist);(setting);(style)

Then, once I’ve pushed the button a few times to see what falls out, I may move a few things from the detailed descriptions to the initial basic description to emphasize them and make sure they come out. But remember: the more you move up front, the less clear that initial “this is what I want” imperative gets. Going back to my Keira vs Bohdi fights, look at the description below:

“in motion,cinematic parkour fight scene;(half-elf woman,french,spinning in blurred motion,two swords,surrounded by wisps of shadow) slips behind and kicks (evil vampire woman,falling): 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,shocked);in dark elaborate egyptian tomb;dark fantasy,d&d,oil paint,drawing”

You can see that up front I’m telling it I want a moving fight scene. Then I describe the basics of what the two opponents are doing. Then I give details about those two, starting with the same tags as in the basic description – e.g. if the original was “half-elf woman”, the detailed is “half-elf woman,blah,blah,blah”. Then the setting, then the style. All depending upon what you want to emphasize the most.
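That ordering can be captured in a small helper, so the pieces always come out in the same sequence. This is a sketch under my own assumptions: `tags` and `build_prompt` and all their parameter names are made up for illustration, and the separators follow the worked example above (a semicolon after the framing) rather than the colon shown in the template.

```python
def tags(*items):
    """Join tags into one parenthesized, comma-separated group."""
    return "(" + ",".join(items) + ")"

def build_prompt(framing, hero, hero_pose, action, villain, villain_pose,
                 hero_details, villain_details, setting, style):
    # Short action summary first, so the "this is what I want" part leads.
    summary = f"{tags(hero, *hero_pose)} {action} {tags(villain, *villain_pose)}"
    # Detailed descriptions reuse the same leading tag as the summary.
    details = (f"1) {tags(hero, *hero_details)}; "
               f"2) {tags(villain, *villain_details)}")
    return ";".join([framing, f"{summary}: {details}", setting, style])

prompt = build_prompt(
    framing="in motion,cinematic parkour fight scene",
    hero="half-elf woman", hero_pose=["french", "two swords"],
    action="slips behind and kicks",
    villain="vampire woman", villain_pose=["falling"],
    hero_details=["short unkempt bob white hair", "black leather armor"],
    villain_details=["blue egyptian dress", "glowing red eyes", "shocked"],
    setting="in dark elaborate egyptian tomb",
    style="dark fantasy,d&d,oil paint,drawing",
)
print(prompt)
```

Moving a tag from `hero_details` into `hero_pose` is the code equivalent of pulling something up front to emphasize it.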

Putting it all Together:

Now let’s play with that a little. Note that a lot of the prompts below get a 30-50% rejection rate from the AI. Just keep pushing the button and ignore the rejections until you get something you like.
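To put a rough number on “keep pushing the button”: if each try is rejected with probability p, and tries are independent of each other (an assumption on my part, not something I have measured), then the chance that at least one of n tries gets through is 1 - p^n. A quick sketch, with `chance_of_success` being a made-up name:

```python
def chance_of_success(rejection_rate, tries):
    """Probability that at least one of `tries` independent attempts gets through."""
    return 1 - rejection_rate ** tries

# Rejection rates at both ends of the 30-50% range mentioned above.
for rate in (0.3, 0.5):
    for tries in (1, 2, 3, 5):
        print(f"rejection {rate:.0%}, {tries} tries: "
              f"{chance_of_success(rate, tries):.1%} chance of a result")
```

Under those assumptions, even at a 50% rejection rate three tries give you an 87.5% chance of getting something back.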

“in motion;(half-elf woman,french,spinning in blurred motion,katana) spins and kicks (evil vampire woman,falling back): 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,shocked);in dark elaborate egyptian tomb;fantasy,d&d,oil paint,drawing”

Clyde Caldwell painting,in motion;(half-elf woman,french,spinning in blurred motion,katana) spins and kicks (evil vampire woman,falling back): 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,shocked);in dark elaborate egyptian tomb

“Frank Frazetta painting;in motion,view from the side;(half-elf woman,french,blurred motion,katana) slashes her katana under (evil vampire woman,leaping up high): 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,angry);in dark elaborate egyptian tomb”

Julie Bell painting;in motion,(half-elf woman,french,blurred motion,katana) ducks and parries with her katana (evil vampire woman,swinging sword): 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,angry);in dark elaborate egyptian tomb

Clyde Caldwell painting;in motion,view from the side;(half-elf woman,french,blurred motion,katana) spins and slashes her katana as (evil vampire woman,two daggers) leaps over it: 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,angry);in dark elaborate egyptian tomb

Michael Whelan painting;in motion;(half-elf woman,french,blurred motion,katana) dodges and blocks with her katana as (evil vampire woman,two daggers) leaps and slashes: 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,angry);in dark elaborate egyptian tomb

Keith Parkinson painting;in motion;(half-elf woman,french,blurred motion,katana) dodges and blocks with her katana as (evil vampire woman,two daggers) leaps and slashes: 1) (half-elf woman,french,short unkempt bob white hair,age 23,glowing purple eyes,black leather armor,small glowing tattoo on cheek); 2) (vampire woman,blue egyptian dress,long dark hair,glowing red eyes,blue skin,angry);in dark elaborate egyptian tomb

And, just for fun, Keira and Kord sparring:

“Keith Parkinson painting;in motion;(half-elf woman,french,blurred motion,spinning with wood staff) sparring with (large half-orc guard,swinging wood staff overhead): 1) (half-elf woman,french,short unkempt bob white hair,mischievous smile,age 24,purple eyes,black leather armor,small tattoo on cheek); 2) (large half-orc city guard,muscular,age 28,male,german,tall,grayish-green skin,goatee,black hair in neat topknot,heavy leather armor,big grin);in training hall”

Keith Parkinson painting;in motion;(half-elf woman,french,blurred motion,crouched,spinning with wood staff) sparring with (large half-orc guard,swinging wood staff overhead): 1) (half-elf woman,french,short unkempt bob white hair,mischievous smile,age 24,purple eyes,black leather armor,small tattoo on cheek); 2) (large half-orc city guard,muscular,age 28,male,german,tall,grayish-green skin,goatee,black hair in neat topknot,heavy leather armor,big grin);in training hall

Keith Parkinson painting;in motion;(half-elf woman,french,blurred motion,leaping high,long wood staff) sparring with (large half-orc guard,swinging long wood staff under her feet): 1) (half-elf woman,french,short unkempt bob white hair,mischievous smile,age 24,purple eyes,black leather armor,small tattoo on cheek); 2) (large half-orc city guard,muscular,age 28,male,german,tall,grayish-green skin,goatee,black hair in neat topknot,heavy leather armor,big grin);in training hall

Keith Parkinson painting;in motion;(half-elf woman,french,blurred motion,leaping high,long wood staff) flipping over the head of (large half-orc guard,swinging long wood staff): 1) (half-elf woman,french,short unkempt bob white hair,mischievous smile,age 24,purple eyes,black leather armor,small tattoo on cheek); 2) (large half-orc city guard,muscular,age 28,male,german,tall,grayish-green skin,goatee,black hair in neat topknot,heavy leather armor,big grin);in training hall

2 responses to “AI Concepts: The Art of Combat”

  1. Would it shock you to hear Larry Elmore is usually my favorite…

    Too funny. I particularly like Caldwell a lot too though. And these Parkinson inspired images look better to me than the real thing does! I often think of his work as too cold, or emotionless.

    I think I’ve only tripped the first filter once or twice? But that second one is a bear! It makes me mad sometimes. But I’ve noticed that often happens where I’m only getting a couple images on each cycle. I’ve had quite a few where I only get one or two images; and I look back over my prompt and think “that AI has a dirty mind…”

    This really looks like a post I’ll be using as reference, a lot of good stuff here.

    1. Wouldn’t shock me in the slightest. He’s great for some characters, not so much others. It’s a style thing. Wouldn’t use him for Grim. Would absolutely use him for Lady Katarina and company. But it is also harder to get explicit violence out of his style with the AI than it is with some of the others. Just due to the nature of his work over time as used in the training sets.
