Skip to content

The Patron Saint of Superheroes

Chris Gavaler Explores the Multiverse of Comics, Pop Culture, and Politics

Monthly Archives: January 2016

That title means both “words and pictures” and “words in pictures,” because both phrases describe comic books. Although not all comics include words, essentially all superhero comics do. (A near exception, the five-page “Young Miracleman” story in the back of Miracleman #6 includes two talk balloons each containing the transformation-triggering word “Miracleman!” and a range of sound effects, newspaper headlines, and signage.) How images and text work together is one of the most complex and distinctive qualities of the form.

Words in comics have their traditional linguistic meanings, but they are also drawn images that must be understood differently than words in prose-only works. Their line qualities and surroundings influence their meanings. Dialogue and narration are traditionally rendered at a later stage of production by a separate letterer, after the penciler and inker have completed their work. The size, shape, and color of lettering can denote volume, tone, or intensity, especially when representing speech. Bolding is especially common, typically multiple words per sentence. Sound effects, however, are drawn by primary artists as part of the images. These are onomatopoeic words or letters that represent sounds in the story world. Often the lettering style is so expressive it communicates more than the letters’ linguistic meaning.

Words are typically framed within a panel. Spoken dialogue appears in talk balloons (traditionally an oval frame with a white interior), internal monologues in thought balloons (traditionally a cloud-like frame with a white interior), and unspoken narration in caption boxes (traditionally rectangular and colored, though sometimes narration appears in separate caption panels or in white gutters). Adding a pointer to a word container and directing it at an image of a character turns the words into sound representations or, if a thought balloon, into representations of an unspoken but linguistic mental process, both linked to the specific place and time of the depiction.

The absence of a pointer on a caption box indicates that the words originate from outside of the depicted scene. First-person narration with no pointer may be linked to a remote setting if the words are composed by a character from some other, implied moment and location that is not visually depicted. Though the words in talk balloons are understood to be audible to characters, the drawn words and containers are not visible within the story world even when drawn blocking story elements. As with lettering, the size, shape, and color of containers communicate additional meanings about the words. For talk balloons, the graphic quality of the balloon edges denotes how the words are thought, spoken, whispered, shouted, etc. Finally, the containers create semantic units similar to line breaks or stanzas in poetry.

Words also influence and are influenced by surrounding images that are part of the subject content. Pioneering comics artist Will Eisner identifies two kinds of images: a “visual” is a “sequence of images that replace a descriptive passage told only in words,” and an “illustration” is an image that “reinforces (or decorates) a descriptive passage. It simply repeats the text” (132). Scott McCloud goes further, identifying seven “distinct categories for word/picture combinations” (). Two of McCloud’s categories, “word-specific” and “duo-specific,” correspond with Eisner’s “illustration,” while the other five (picture-specific, intersecting, parallel, independent, montage) fall under Eisner’s “visual,” which “seeks to employ a mix of letters and images as a language in dealing with narration” (139).

To indicate the level of image-text integration, we combine and arrange McCloud’s and Eisner’s categories in a spectrum, beginning with the highest level integration.

Montage visual: images include words as part of the depicted subject matter. This is the only instance in comics in which words are part of the story world. All other words are discourse only.

Interdependent visual: images and words communicate different information that combines.

Intersecting visual: images and words communicate some of the same information, while also communicating some information separately.

Image-specific visual: images communicate all information, while words repeat selected aspects.

Word-specific illustration: words communicate all information, while images repeat selected aspects.

McCloud also includes two categories that are not integrated, and we add two more.

Duo-specific illustration: images and words communicate the same information. Although this might appear to be the most integrated category, there is no integration if each element only duplicates the other so that no information is lost if either element is ignored. Words and images are independent.

Image-only visual: isolated images communicate all information. Since comics do not require words, this is the most fundamental aspect of the form.

Word-only text: isolated words communicate all information. This requires the highest level of reader visualization, an approach at odds with graphic narratives as a form.

Parallel visual: images and words communicate different information that do not combine. This requires the same level of reader visualization as word-only texts, but the presence of images complicates and potentially interferes with that visualization.

With the exception of the most integrated category, montage visuals, all combinations of words and pictures produce some level of image-text tension because words, unlike images, exist only as discourse. Though drawn on the page, words are not visually perceptible to the characters in the story. Images, however, depict content that is perceptible to characters, so drawn objects and actions appear as both discourse (ink on paper) and diegesis (the world of the story). A drawing of a superhero flying (discourse) communicates the fact that the superhero is flying in the story (diegesis). The words “the superhero is flying” communicate the same diegetic fact, but the ink-formed letterforms bear no resemblance to their subject matter. There is no overlap between diegesis and discourse. Since both words and images are made of ink lines on paper (because printed words are images), some lines in a comic exist only in the reader’s world and some appear to exist in both the reader’s and the characters’ worlds.

Graphic novels create further image-text tension by highlighting the potential gap between text-narration and image-narration. In graphic memoirs such as Art Spiegelman’s 1980-1991 Maus, Marjane Satrapi’s 2003 Persepolis, and Alison Bechdel’s 2006 Fun Home, the text-narrator and the image-narrator are understood to be the same person, the actual author. When a character in a graphic novel controls the first-person text-narration in caption boxes, it is not necessarily clear whether that character is also controlling the image-narration in panels. If the words are generated by an omniscient third-person text-narrator, does that same narrator generate the images, or are the images generated by a separate narrator?

Unintegrated image-texts imply a separate text-narrator and image-narrator. In the case of a duo-specific image-text, the two modes of narration duplicate information without any integration, as if two narrators are unaware of each other. Integrated image-texts, however, imply a single narrator controlling both words and images in order to combine them for a unified effect. At the center of the spectrum, a word-specific illustration implies an image-narrator aware of text but a text-narrator unaware of image. Similarly, an image-specific visual implies a text-narrator aware of images but an image-narrator unaware of text.

Parallel visuals are more complex; although the two narrations are independent and so seemingly unaware of each other at the level of the panel, the overarching effect is integrated. In such cases, the separate text- and image-narrations may create a double image-text referent, in which a word has one meaning according to its linguistic context but, when read in the context of the image, acquires a second meaning. Alan Moore is best known for this approach, having perfected it with Dave Gibbons in Watchmen.

 Image result for watchmen comic panels

I’m teaching “Superhero Comics” this semester, and so I’m once again pulling out Scott McCloud’s abstraction scale:

Image result for mccloud abstraction scale

It begins with a photograph of a face and ends with a face comprised only of an oval, two dots, and a straight line. McCloud calls that last face a “cartoon” and the middle face the standard for “adventure comics,” ie superheroes. All of the faces to the right of the photograph further “abstract [it] through cartooning” which involves “eliminating details” by “focusing on specific details.” Computer programs can do the same kind of stripping down:


But some “simplification” isn’t so simple. Look at the different between this photograph and its cartoon version.

make a caricature profile picture of you

McCloud’s scale actually combines two kinds simplification. Each step to the right of his spectrum appears simpler because: 1) the image contains fewer lines, and 2) the lines are in themselves less varied. A line becomes smoother by averaging its peaks and lows into a median curve, and so the second kind of simplification is a form of exaggeration. Since exaggeration can extend beyond averaging McCloud’s spectrum actually requires two kinds of abstraction. Each face is altered both in density and in contour quality. Density describes the number of lines; contour quality describes the magnification and compression of line shapes. Abstraction in density reduces the number of lines; abstraction in contour quality warps the line shapes. The less density reduction and the less contour warpage, the more realistic an image appears.

I like McCloud’s five-point scale though, so I’ll offer two of my own.

The Density Scale:

  1. Opacity: The amount of detail is the same or similar to the amount available in photography.
  1. Semi-Translucency: The amount of detail falls below photorealism, while the image still suggests photorealistic subject matter.
  1. Translucency: While reduced well beyond the range of photography, the amount of detail evokes photorealistic subject matter as its source material. This is the standard level of density in superhero comics art.
  1. Semi-Transparency: The sparsity of detail is a dominating quality of the image, and subject matter can evoke only distantly photographic source material. Semi-Transparency is more common in caricature and cartooning.
  1. Transparency: The minimum amount of detail required for an image to be understood as representing real-world subject matter.

The Contour Scale:

  1. Duplication: Line shapes are unaltered for an overall photographic effect. Though naturalistic, reality-duplicating line shapes exceed the norms of superhero art by reproducing too much information.
  1. Generalization: Line shapes are magnified and/or compressed to medians for an overall flattening effect that conforms to naturalistic expectations. Generalization is the standard level of abstraction for objects in superhero art.
  1. Idealization: Some line shapes are magnified and/or compressed to medians while others are magnified and/or compressed beyond their medians for an overall idealizing effect that challenges but does not break naturalism. Idealization is the standard level of abstraction for superhero characters.
  1. Intensification: Line shapes are magnified and/or compressed beyond their medians for an overall exaggerating effect that exceeds naturalistic expectations. If the intensification is explained diegetically, the line shapes are understood to be literal representations of fantastical subject matter within a naturalistic context. If the intensification is not explained diegetically, then the line shapes are understood as stylistic qualities of the image but not literal qualities of the subject matter. Explained or Diegetic Intensification is common for fantastical subject matter in superhero art; unexplained or Non-diegetic Intensification occurs selectively.
  1. Hyperbole: Line shapes are magnified and/or compressed well beyond medians for an overall cartooning effect that rejects naturalism entirely. Hyperbole is uncommon in superhero art because the stylistic qualities of the image dominate and so prevent a literal understanding of the subject matter. Hyperboles in a naturalistic context are understood metaphorically.

The two scales can also be combined into a Density-Contour Grid:

1-5 2-5 3-5 4-5 5-5
1-4 2-4 3-4 4-4 5-4
1-3 2-3 3-3 4-3 5-3
1-2 2-2 3-2 4-2 5-2
1-1 2-1 3-1 4-1 5-1

Both scales take photorealism as the norm that defines variations.

McCloud’s photographed face is the most realistic because it combines Opacity and Duplication, 1-1 on the grid, demonstrating the highest levels of density and unaltered contour. It’s opposite is not McCloud’s fifth, “cartoon” face, which combines Transparency with Idealization, 5-3; its level of density reduction is the highest and so the least realistic possible, but its contour warpage is moderate and so comparatively realistic. Replace the oval with a circle to form a traditional smiley face, the contour quality would rise to Hyperbole, 5-5, the most abstract and so the least realistic position on the grid.

Image result for smiley face

Cartooning covers a range of grid points, but most cartoons fall between 4-4 and 5-5, both high density reduction and high contour warpage. Charles Shultz’s circle-headed and minimally detailed Charlie Brown is a 5-5.

The characters of Archie Comics are some of the most “realistic” of traditional cartoons at 4-4.

McCloud’s middle, “adventure comics” face combines Translucency and Idealization, 3-3, the center point of the grid and the defining norm of superhero comic art.

Image result for superman

Like all grid points, 3-3 allows for a variety of stylistic variation between artists, within a single artist’s work, and even within a single image, but it does provide a starting point for visual analysis by defining areas of basic similarity.

Tags: , , , , ,

Reading a comic book is easy–even when there are no words to be read. You just look at a picture, and then at the next picture, and so on. But why do any of the pictures make sense side-by-side? What is your brain doing as it leaps from image to image?

In Understanding Comics, Scott McCloud defines the Gestalt psychology principle of “closure” as the “phenomenon of observing the parts but perceiving the whole” (though it more specifically indicates a viewer filling in visual gaps between disconnected parts) and applies it to comics gutters: “Nothing is seen between panels, but experience tells you something must be there!” He goes on to explain: “Comics panels fracture both time and space, offering a jagged, staccato rhythm of unconnected moments. But closure allows us to connect these moments and mentally construct a continuous, unified reality.”

McCloud focuses his analysis on gutters and therefore types of transitions possible between panels (though closure is independent of panels and gutters, since insets and interpenetrating images work the same ways). He comes up with six types:

Image result for scott mccloud panel transitions

They work reasonably well, but his focus on panel transition has always struck me as slightly off. When I use it in class, students often don’t come to a clear consensus when analyzing any particular panel sequence. Moment-to-moment and action-and-action, for instance, are often ambiguous, sometimes combining identical leaps in time. And since actions do occur in McCloud’s moment-to-moment examples (a women blinks!), it’s not exactly clear what constitutes an “action.” Aspect-to-aspect can also be indistinguishable from subject-to-subject, both of which may or may not involve a movement in time, and so may or may not also be moment-to-moment or even action-to-action. And scene-to-scene might be a location leap and so also a kind of aspect-to-aspect at the big picture level, or a scene-to-scene can be in the same location but at a different time–so then how much time has to pass for an old scene to become a new scene?

These are annoying questions, but they really do come up when you try to breakdown a panel sequence with a roomful of students. So instead of categorizing transitions, my colleague Nathaniel Goldberg and I categorized types of closure while drafting our essay “Caped Communicators: Conversational Depiction and Superhero Comics.” Instead one all-purpose “perceiving the whole” process, we see four very different kinds of closure, each of which can occur by itself or in combinations.

Spatial:  Subject matter drawn in separate images is depicted as existing in physical relationship to each other, typically as a result of panel framing. (What McCloud identifies as aspect-to-aspect, subject-to-subject, and some scene-to-scene transitions require spatial closure.)

Temporal:  Undrawn events are depicted to take place outside of events drawn in separate images, typically as a result of panel transitions and so occurring as if in gutters. (What McCloud identifies as moment-to-moment, action-to-action, and some subject-to-subject and scene-to-scene transitions require temporal closure.)

Causal: Drawn action is understood to have been caused by an element absent from a current image but drawn in a preceding image. (None of McCloud’s transitions, not even action-to-action, accounts for this type of closure.)

Associative: A metaphorical relationship is depicted between two images in which one image is understood to represent some idea about the other image. (Though McCloud does not identify this type of closure, Jessica Abel and Matt Madden in Drawing Words Writing Pictures add “symbolic” to McCloud’s list of transition types. Such symbolic transitions require associative closure.)

It always helps to look at specific examples, so consider this three-panel sequence at the top of page 28 in Watchmen #8:

In the first image, artist Dave Gibbons draws the shadow of a statuette cast over the face of a frightened man kneeling on the floor. The second image shows the statuette in the fist of an attacker. Taken together, spatial closure is required to understand that the two images occur within a few feet of each other, with each image drawn from one of the two men’s points of view; this is true even though the white background emphasizes the attacker’s action but eliminates all other setting information. The second image also requires temporal closure because the statuette is behind the attacker’s head at an angle that would not cast the shadow seen on the victim’s face in the first image. Gibbons therefore also depicts a movement forward in time during which the attacker has cocked his arm back to strike.

The third image depicts a jack-o-lantern crashing to the floor with some falling books. It uses all four forms of closure. The pumpkin exists in the same space as the two now undrawn men (spatial closure). The pumpkin is crushed at a moment immediately following the second image (temporal closure). The falling books have been knocked down by the now undrawn victim striking the shelf behind him (causal). And, because it resembles a human head and breaks open in the panel where a reader anticipates the statuette striking the man’s head, Gibbons implies that the man’s head has been similarly damaged (associative). Alternatively, the pumpkin, as shown in the first panel, is already falling, because the leg of another attacker has knocked the shelf off balance. If so, then the pumpkin and the head of the victim are smashed at the same moment, with no causal closure but associative only.

So our closure types are deeply indebted to McCloud, but I think they also improve on his. I’ll be testing these out in my classroom soon, so hopefully my students will agree. More on that later.

Tags: , , ,

Having taught my spring term seminar Superheroes a half dozen times now, I’m converting it to one of the gateway courses for Washington and Lee students entering the English major. The overhaul means jettisoning the pre-history of the genre (I love that stuff, but I could just hand my students On the Origin of Superheroes and be done with it) and focusing much more on comics as an art medium. So I’m trying to boil down the basics, the must-know criteria for analyzing a comic book.

So now it’s time invite Neil Cohn to the lectern. If you haven’t read his The Visual Language of Comics, please do. Meanwhile, here’s my boiling down of his visual language grammar.

Narrative panel types: images may be categorized according to the kinds of narrative information they contain and how that information creates a visual sentence when read in sequence:

Orienter: introduces context for a later interaction (no tension).

Establisher: introduces elements that later interact (no tension).

Initial: begins the interactive tension.

Prolongation: continues the interactive tension.

Peak: high point of interactive tension.

Release: aftermath of interactive tension.

Cohn only looks at comic strips, which typically express a single sentence in a linear arrangement of three or four panels, but longer graphic narratives can express multiple sentences on a single page or extend a single visual sentence over multiple pages.  To analyze the different ways that can work, I’m adding some terminology to Cohn’s.

Closed sentences: two sentences that begin and end without sharing panels.

Overlapping sentences: sentences that share panels.

Interrupted sentence: an overlapping sentence that does not complete or initiate its tension before another sentence replaces it; sentences might share an Orienter, or an Establisher may introduce two elements that do not interact until later as a form foreshadowing.

Dual-function panel: in overlapping sentences, one panel performs two narrative functions. A panel may, for example, serve as the Release of one sentence and also the Orienter, Establisher, or Initial of the next. Or an Orienter may  serve as the Establisher of an interrupted sentence that initializes tension later.

Sentence Layout: the relationship of visual sentences to pages.

Page sentence: a sentence that begins with the page’s first panel and ends with the page’s final panel.

Multipage sentence: a sentence that extends beyond one page.

End stop: a page and a visual sentence end simultaneously.

Enjambed: a page ends before the visual sentence ends, also called a visual cliff-hanger.

This is all awfully abstract, so let me give specific examples from The Walking Dead again.

Robert Kirkman and Tony Moore like enjambment. Their first issue includes several cliff-hangers. The bottom row of page five begins with an Establisher (introducing the door to the already established figure of Rick), is followed by an Initial panel (Rick removes the piece of wood holding the door closed), and ends with a Peak (Rick is opening the door).

But the Release only appears after turning to page six. That full-page panel is also a dual-function panel because it serves as the Establisher (introducing the zombies to the already established Rick) for the next, overlapping sentence.

The turn from page nine to ten is similar. The first panel in the bottom row of page nine is an Establisher (Rick and the bicycle), followed by the page-ending Peak of Rick’s shocked reaction. The top of page ten provides the Release (we finally see what he sees).

A similar grammatical pattern repeats on pages thirteen and fourteen. The first panel in the bottom row of thirteen is an Orienter. The second is an Establisher (Rick’s face seems to be reacting to something, a sound presumably), and the last panel is an Initial. Turn the page, and there’s the Peak.

The visual grammar also shows that cliff-hangers only work on the final panel of a two-page spread, in order to prevent a reader’s eye from skimming to the critical image prematurely (which happens in my arrangements above).

Also, Moore and Kirkman don’t always enjamb their visual sentences. Page one, for instance, ends on a Peak. The page also begins with an Initial, followed by four Prolongation panels. Page one is a complete page sentence, both beginning and ending on a single page.

Instead of a Release, the next page begins with an Orienter (Rick in his hospital room) for the next visual sentence, which does not overlap.

In terms of interrupted sentences, page ten begins a new visual sentence with the top Establisher (introducing the bicycle zombie to the already established Rick), followed by two Initials (Rick and the zombie interact) in the second. The bottom row begins with two Prolongations, followed by a Peak (Rick’s tear) and a Release (the zombie closes its mouth).

The visual sentence appears to have ended when the next page begins a new sentence with no further interaction between Rick and the zombie. So page ten reads as a complete page sentence, until the bottom of page twenty-three continues the interaction with a Prolongation panel, retroactively showing that the visual sentence was interrupted.

Page twenty-four provides a new Peak (Rick shoots the zombie), followed by two Release panels (Rick looking down, the zombie with a bullet hole in its forehead).

Rick’s tear is also a Prolongation of his tear on page ten, an additional overlapping sentence that reaches its Peak in the next panel when Rick wipes the tear away. The final three panels are Releases. They’re also their own overlapping, three-panel sentence: Initial (Rick and the car), Peak (Rick gets into the car), and Release (car has driven off). The page and the issue conclude with an end stop.

And, concluding this post, my apologies to the world of poetry for driving off with your terms “end stop” and “enjambment.” Until now they only meant “a poetic device in which a pause comes at the end of a syntactic unit” and ” the continuation of a sentence without a pause beyond the end of a line.”



Tags: , , , , ,

%d bloggers like this: