The Patron Saint of Superheroes

Chris Gavaler Explores the Multiverse of Comics, Pop Culture, and Politics

The introduction of The Comics Form separates two often and easily conflated kinds of comics: things that are called “comics” because they are in the comics form and things that are called “comics” because they are in the comics medium. I try to give each a pretty straightforward definition:

  • Works in the comics form are sequenced images.
  • Works in the comics medium are works published and identified as a comic by an entity that identifies as a comics publisher.

I realize “entity” is an odd term, but it covers the range of possibilities: mainstream comic book publishers, literary journals that publish comics (like Shenandoah where I’m comics editor), mini-comics made on photocopiers, etc. (I ignore a fairly obvious technicality though: not everything published by a comics publisher is a comic. But since my focus is on the comics form, not the comics medium, I decided not to go down that and related rabbit holes.)

Dividing form and medium produces three subcategories:

  • Works in the comics form but not the comics medium, which include all sequenced images not traditionally identified as comics.
  • Works in the comics medium but not the comics form, which include single-image cartoons.
  • Works in both, which include the vast majority of works in the comics medium.

Single-image cartoons have been the most overt stumbling block for producing a general definition of comics because they aren’t in the form but they are still routinely called “comics.” They’re also called “cartoons,” but that’s (primarily) a description of their style: simplified and exaggerated.

It doesn’t help that some “cartoons” are also sequenced images:

That’s by Politico’s Matt Wuerker, who also edits the online magazine’s weekly selection of political cartoons. Since it’s divided into four sequenced images, it’s in the comics form. I would also say it’s in the comics medium, though the claim reveals a shortcoming of my above definition. Politico’s “Cartoon Carousel” begins:

“Every week political cartoonists throughout the country and across the political spectrum apply their ink-stained skills to capture the foibles, memes, hypocrisies and other head-slapping events in the world of politics. The fruits of these labors are hundreds of cartoons that entertain and enrage readers of all political stripes. Here’s an offering of the best of this week’s crop, picked fresh off the Toonosphere.”

Wuerker’s description doesn’t mention the word “comics,” making Politico a cartoon publisher but not necessarily a comics publisher. Since Politico only publishes political content (the vast majority not cartoons), its cartoons are more specifically political cartoons, a kind of art that traditionally appears in newspaper editorial sections but not newspaper comics sections (AKA, “the funnies”).

But whether or not it is technically in the comics medium, Wuerker’s four-panel political cartoon is in the comics form. So is this one by the Atlanta Journal-Constitution’s Mike Luckovich:

Or at least it’s in the comics form if you view it as consisting of more than one image. If you understand it instead as a single image of an elephant standing in front of two diegetically juxtaposed images–like a lecturing curator standing in front of two paintings in an art gallery–then it’s not in the comics form because it’s a single image.

I perceived it as three images because the two background panels are framed in a way that suggests a traditional comics layout, making the middle strip a gutter rather than, say, the white wall the images are hanging on. The rectangular panels are juxtaposed two-dimensionally, while the elephant (which of course is also juxtaposed two-dimensionally since the entire cartoon is two-dimensional) appears to be juxtaposed three-dimensionally. Since there’s no diegetic space implied (Luckovich could have drawn the panel content instead as images on two TV screens, for example), I would call this an example of layout as a “secondary diegesis.” While all layouts are secondary diegeses, Luckovich makes that explicit by drawing the elephant as if “in front of” the two panels.

Brian Stelfreeze creates a similar effect in Black Panther #1 (June 2016):

I analyze that page at length in The Comics Form, but I think you can see the essential similarities, especially how the three background images are diegetically separate from the foreground, representing four scenes simultaneously.

Here’s another complication: single-image cartoons and sequenced images that are also in the comics medium often use many of the same conventions. Speech balloons, for example. Here’s Bill Bramhall from the New York Daily News:

There’s no sense Bramhall’s political cartoon is in the comics form because there’s no sense that it can be understood as more than one image. Either way, talk balloons are not part of the comics form. A work in the comics form can certainly include talk balloons, but having or not having talk balloons doesn’t determine anything. That’s true of works in the comics medium too. A wordless comic is still a comic (however you define “comic”). But speech balloons are a wildly common convention of the comics medium, including works that are in both the medium and the form (which is the majority of things we tend to call “comics”).

More interestingly, eating a talk balloon violates the impression that talk balloons are not part of the image’s diegetic world. Characters shouldn’t be able to see them, let alone touch and chew them. By pleasant coincidence, I’ve been corresponding with Rodolfo Dal Canto following the Invisible Lines conference in Venice earlier this summer, and he recently sent me a segment from an Italian comic that plays the same meta game as Bramhall. Bilotta, Righi and Ponchione’s “Gli uomini della settimana” is about “a superhero who can interact with the words in the comic book (balloons but also onomatopoeia) and use or modify them”:

I’ve also been corresponding with Lukas Wilde since we participated on a comics theory panel at NeMLA in Baltimore last spring, and he sent me a related cartoon too:

Talk balloons are like thought bubbles, caption boxes, special effects words, emanata, and frame edges–things that don’t (normally) exist in the diegetic world for characters to perceive. But viewers can perceive them as though they are kinds of physical objects (overlapping panels, for example), which is why I call layout a secondary diegesis. Characters manipulating speech bubbles is related because of the metafictional effect, but I’m still considering whether speech bubbles are (necessarily) elements of a secondary diegesis in the same way that an arrangement of panels is perceived as though it were a set of flat images placed on top of a page surface or on top of another image, as, for example, Sami Kivela does in Undone by Blood:

Luckovich employs a similar technique, but minus a rectangular frame edge for the elephant figure “on top of” the other panels, and with the addition of the elephant’s apparent metafictional awareness of the image arrangement as well as the implied viewer being addressed. Also, the speech bubble functions the same way as layered panels do, and so arguably simply is a kind of panel:

I think works in the comics form are more prone to metafiction because they must arrange their images in some manner, which then draws attention away from the image content (the primary diegesis) and toward the (often illusionary) effects of a resulting secondary diegesis.

And that applies to any work in the comics form, whether it’s also in the comics medium or, more specifically, in the genre of political cartoons.

A nice thing about publishing a new book is the chance to improve ideas from old books. While I do that more than once in The Comics Form: The Art of Sequenced Images, it’s also nice to revisit an old idea and find that I still agree with my old self.

In the last chapter, “Sequenced Image-Texts,” I try to identify all of the possible relationships between words and images. Step one is figuring out the range of how two things can relate, and after going back to what I told prospective comics artists in Leigh Ann Beavers’ and my Creating Comics: A Writer’s and Artist’s Guide and Anthology in 2021, I stuck with the same four interactions:

  • Duplicate: the two sets primarily overlap each other, neither contributing uniquely to the whole.
  • Complement: the two sets primarily correspond, one or both providing additional but congruent qualities to the whole.
  • Contrast: the two sets primarily contradict, each providing incongruent qualities to the whole.
  • Diverge: the two sets appear primarily unrelated, neither contributing to a whole.

This time though I added what I hope is a clarifying illustration. Duplicating features (mostly) overlap. Complementing and contrasting features partially overlap. Diverging features don’t overlap at all (and so suggest no basis for comparison or contrast).

Scott McCloud identified seven word/picture combinations in Understanding Comics, but I think these four are sufficient and, hopefully, clearer. Umberto Eco introduced three in 1965, missing the fourth because the comic he was analyzing, Milton Caniff’s 1947 Steve Canyon, unsurprisingly didn’t include an example. John Bateman, the go-to expert on all things image-texts, noticed of McCloud: “As usual there are some patterns which continue to recur” (2014: 99). He said similarly of an earlier analysis of children’s picture books: “It is in fact striking just how often similar lists are suggested in different areas without, apparently, very much interaction between the distinct inquiries” (2014: 73). Which means I’m happily joining a list of fellow wheel-reinventers.

Bateman also distinguishes between what he calls “internal” relationships, “where the text ‘is’ the image,” and “external” relationships, “where the text … relates to other images” (2014: 27). Building on that idea, I see two kinds of internal relationships:

  • First, if you’re looking at a word in isolation, the relationship is between what a word means and how it is drawn.

I was only allowed to include so many illustrations in The Comics Form, and this one I only mentioned, so I’ll include the actual images here. Bob Wiacek and Todd McFarlane’s The Incredible Hulk #340 (February 1988) title design complements. The meaning of the word “HULK” and the stylistic rendering of the letters as blocks of stone communicate similar but not identical ideas:

If instead of “THE INCREDIBLE HULK” the words were “STONE BLOCKS,” the meaning and style would duplicate. René Magritte’s 1950 painting The Art of Conversation employs a similar stylistic approach but for opposite effect. Painting the French word for dream, “RÊVE,” as blocks of stone contrasts the meaning of the word:

  • Second, if you’re looking at a wordless image, the relationship is between what the image represents and how it is drawn.

That’s probably clearest when you’re looking at two drawings of the same subject. Though Norman Rockwell and Al Hirschfeld are both drawing Bob Hope, their stylistic approaches are remarkably different:

Things get more complicated when you combine a word and an image, but most image-text analysis looks primarily at one relationship: between what the word means and what the image represents.

I particularly enjoy when that relationship contrasts. Last Christmas I got Lesley a tarot deck drawn by Michelle Tea, which includes this bonus card:

The words contrast the swords piercing the figure’s body–though I suppose they also complement the figure’s implied attitude as she indifferently reads her phone.

Here’s another. The ice cream shop in my town has displayed this sign for years:

The meaning of “CASH” in the phrase “CASH ONLY” is paper money (which is reinforced by the small print: “CHECKS ACCEPTED / ATM AVAILABLE”). But combined with a contrasting image of the singer Johnny Cash, “CASH” gains a double referent. If the ice cream shop owner were slightly braver, he might instead display this sign:

The image then would trigger the words “Johnny Cash,” which would then trigger the homonym “cash,” producing the operative meaning. Since the shop owner likely does not want any confusion (plenty of folks probably wouldn’t recognize the singer), the image is a kind of repetition with a playfully superfluous additional meaning.

Complementing relationships are fun too. I snapped this photo while on a walk with Lesley last summer. The branches nearly obscure the words, yet they convey a similar idea:

There are three more relationships:

  • between what the word means and how the image is drawn
  • between how the word is drawn and what the image represents
  • between how the word is drawn and how the image is drawn

That’s a total of six relationships present in every image-text consisting of at least one word and one image. If an image’s subject is its meaning, maybe the easiest way to categorize them is:

  • word meaning and word style
  • image meaning and image style
  • word meaning and image meaning
  • word meaning and image style
  • word style and image meaning
  • word style and image style
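The count of six can be checked mechanically: the relationships are simply the unordered pairs drawn from four qualities (a word's meaning and style, an image's meaning and style). Here is a minimal Python sketch; the string labels and the function name are illustrative choices, not terminology from the book.

```python
from itertools import combinations

# Each of the two elements (word, image) has a meaning and a style.
# These labels are illustrative assumptions, not terms taken from the book.
QUALITIES = ["word meaning", "word style", "image meaning", "image style"]


def relationships(qualities=QUALITIES):
    """Return every unordered pair of distinct qualities."""
    return list(combinations(qualities, 2))


if __name__ == "__main__":
    pairs = relationships()
    for first, second in pairs:
        print(f"{first} and {second}")
    print(f"total: {len(pairs)}")
```

The pairs come out in a different order than the list above, but the set of six is the same.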

Often there is more than one word and more than one image part, creating even more relationships. Keeping all of those webs of potential meanings straight is complicated, and usually not necessary. What matters is tracking the possibilities, and pausing when one yields an interesting result.

Consider the poster for the 2008 film I Can’t Think Straight:

Looking first at just the word “Straight,” how it is drawn (curving font) contrasts its meaning (or one of them).

Its style also contrasts the style of the first three words, accenting “straight” by using a different font, a different color, and a lower placement.

The word’s more relevant meaning emerges in relationship to the image. The words in isolation would probably produce something like: “I can’t think clearly.” That meaning remains, but the image relationship extends it to include an explanatory pun through another contrasting relationship. What the words mean changes as a result of what the image represents.

The black font also duplicates the black of the foregrounded figure’s dress. The white of the first three words, “I Can’t Think,” duplicates the foregrounded figure’s earrings and second figure’s necklace, further linking the words to the characters.

The style of the image is sexualized, suggesting not just the general meaning of (the implied word) gay, but probably the imminent possibility of sex.

The book cover design for the novel uses the same words, as well as images of the same two actresses portraying the same two characters, but it produces different effects through different word-image relationships:

This time all four words use the same font. Though that font is relatively straight, without a contrasting curved font, that meaning of “straight” is largely absent. Rather than accenting “straight,” the design emphasizes “think” by giving it a contrasting color. “Think,” however, doesn’t gain an additional meaning as a result. “Straight” still requires the image to refer to sexuality, but because the style of the image is less overtly sexualized (when compared to the movie poster), the effect is perhaps more romantic than sexual. The inclusion of the words “romantic” and “heart-warming” in the blurb reinforces that.

Personally, I would have combined the word styles from the movie poster with the image from the book design, but once again, no one asked me. With one exception, the above examples are outside the comics medium. The illustrations in Chapter 7’s “Embedded Relationships” section are all from image-text comics, so it’s fun to branch out.

The first comic book I remember reading is The Defenders #15 (September 1974). I didn’t know it was #15. I didn’t know comic books were numbered. I just went to the 7-Eleven every week or so and looked for titles I liked with new covers in the spinning rack. They came out monthly, but I probably didn’t know that either. I wasn’t as young as I probably sound.

I did notice that issues ended with previews. In the final panel of #15, Dr. Strange declares: “By the Ancient One’s Eyes! What horror has Magneto wrought?” And Marvel’s omniscient narrator promised in the bottom margin: “Afraid you’ll have to wait till next issue to learn the answer to that, Marvelite—but we strongly suggest you be here to meet … Alpha, the Ultimate Mutant.” And so I would have known to keep an eye out for a Defenders cover with that subtitle.

It wasn’t until I was at some friend’s house looking through his collection that I came across that next issue. This could be a year, maybe two, even three years later. I was startled and excited—and then embarrassed when it became clear that front cover numbers were a well-established publishing norm that only I had somehow not decoded.

I didn’t actually read #16 for more than three decades. Marvel published Essential Defenders Vol. 2 in 2006, but my copy says: “Xmas 2009, Love, Lesley.” I think I had just started teaching my first superheroes course and had asked Santa for a sock full of nostalgia. The reprint volumes are in black and white, so when I started drafting what I hope will be my next book, The Color of Paper, I ordered an original copy too. Alpha the Ultimate Mutant dominates the splash page:

Color artist Glynis Wein renders his skin a consistent pink (probably 25% magenta and 25% yellow), while his body and facial features transform radically. “With each discharge of my power,” Alpha explains near the end of the issue, “my evolution has progressed with incredible alacrity.” His first line on page two was: “UUNNHH! UUNNHH!” Another member of the Brotherhood of Evil Mutants calls him a “big hairless gorilla,” and the narrator calls him a “Neolithic mutant.” By the middle of the issue, Alpha’s head is indistinguishable from Professor X’s, and by the end he’s a variation of the standard bulbous-headed alien (as seen on the cover).

So Glynis Wein (her husband, Len, scripted the issue) makes a visual claim: skin color is unrelated to evolution. The Neolithic gorilla version of Alpha has the same 25M25Y colored skin as the Professor X and bulbous-brained versions. I don’t know if Len Wein is making the same claim, since gorillas have nearly black skin. Perhaps the evil mutant Unus’s comment was a reaction only to Alpha’s hairless physiognomy, which was sufficiently gorilla-like to override his contradictory skin color. Or perhaps while Len Wein was scripting the issue, he pictured an initially dark-skinned Alpha?

Like most comics scripts, the one that Sal Buscema used while penciling is lost. After the first four pages, Alpha teleports himself and the Brotherhood to New York’s United Nations plaza, where he reappears with not just Glynis Wein’s White skin but also with what I imagine most viewers would register as White facial features, what Len Wein’s narrator describes as “a somewhat different Alpha!” Buscema’s and inker Mike Esposito’s depiction of Alpha’s features during the first four pages is strikingly different: his cranium is small, his nostrils are proportionately wide, and his lips and jaw protrude. He’s a racist caricature of a Black man.

When the issue was reprinted in black and white, the interior areas of line art that demark Alpha’s skin are the color of the off-white paper. Though that color is as consistent as Glynis Wein’s 25M25Y, it does not have the same representational qualities. Viewers do not perceive Alpha’s skin as off-white because the interior areas of all line-enclosed shapes are also the same off-white. His skin color, like the color of most objects, is ambiguous. Viewers would have to determine it according to other non-color visual clues.

A viewer who has seen a color art representation of the Hulk will likely recall that his skin was green and apply that knowledge to his black and white representation in the Defenders reprint. The off-white interiors of his skin-shaping line art would be understood to represent green skin. The off-white interiors of White characters—which includes all nine other repeated characters—would represent the color of White skin. It’s possible that viewers would recall the specific rendering of 25M25Y from earlier artwork, but I suspect they would visualize in a more limited sense and simply understand the characters to have skin that falls in the range of White skin.

But Alpha is different. His appearance on the splash page of #16 is his first appearance anywhere, so prior knowledge of his skin color is not possible. Because Glynis Wein gave him the same color as the White characters, Alpha appears White. But with no color information, how would a viewer interpret his skin color based on his facial features?

I suspect the racist caricature would trigger an impression of dark skin—not because viewers agree with the racist caricature but because they would perceive that the artists are intending to represent a fantastical version of a Black man. Viewers may consider the caricature as an offensively inaccurate portrayal but still perceive the representational intent. If so, the color of the paper within Alpha’s skin edges is likely perceived as falling in the range of Black skin.   

Glynis Wein’s color art blocks the possibility of that caricature-triggered perception of dark skin. Alpha’s White skin may also lessen the impression of his facial features. If so, the 25M25Y interiors alter the meaning of the black marks otherwise interpreted as racistly exaggerated Black physiognomy. The color art also presents a contradiction for viewers: is skin color or physiognomy the primary marker of race? If color defines Color, then Alpha is essentially (and consistently) White. If color doesn’t define Color, then Alpha begins as a Black super-gorilla and evolves into a White godling.

When Alpha teleports the Brotherhood to the United Nations, a man with a pointy goatee and a turban shouts: “By the eyes of Allah! It is … Magneto again!” Glynis Wein gives him redder skin, similar to but darker than the skin of the figure next to him wearing what might be an orientalist robe and fez. Magneto removes another turbaned man from the podium before declaring his demands as the leader of “Homo Superior”: “produce a document granting mutants everywhere supreme domination over every civilized country in the world!” After battling the Defenders at Magneto’s command, Alpha eventually recognizes that Magneto lied about the Defenders being evil. Magneto explains:

“You were born an emotional infant! You couldn’t be expected to understand the reasons for our actions — — or the vicious persecution that forced us to them! Regardless of the deception, you are still a mutant – a mutant just like me! You must stand with us –with the others of your kind! It’s the only hope we have!”

Magneto, the Brotherhood, and the Defenders are colored identically. The only non-25M25Y-colored characters are the briefly glimpsed United Nations members. None appear Black, and Alpha’s evolved features no longer evoke caricatural Blackness either. Mutantkind then is apparently and exclusively White. Though the reference to “vicious persecution” could suggest a range of horrors in U.S. history, including the Jim Crow laws that ended a decade earlier, the exclusion of Black skin suggests otherwise. By blocking an understanding of Alpha as a mutant Black man, however racistly drawn, Glynis Wein’s color art defines Homo Superior as non-Black even at its most “Neolithic” stage.

I don’t recall my reaction to the Alpha artwork when I first glimpsed it c. 1976. I was about ten. I lived in an all or nearly all-White suburban neighborhood outside Pittsburgh. Since I hadn’t yet deciphered the complexities of numbering consecutive issues of comic book series, I doubt I registered the racial puzzle of Alpha’s appearance or the social context that heightened its meanings.

I am delighted to announce that Bloomsbury just officially published The Comics Form: The Art of Sequenced Images. My last three books were two-in-one team-ups, so it feels pleasantly strange to be solo authoring again. This is also a culmination of work I’ve been building toward for about seven years (a very very early version of a chapter subsection appeared as an essay in 2016). I’m humbly hoping to offer some helpful ways to rethink foundational ideas in comics theory, but others will have to determine just how helpful they are. The book is priced for institutions (hardback and ebook), so probably not something most folks will be buying for their own shelves, but I do hope that anyone with a deep interest in comics will nudge their nearest library to purchase a copy.

I also thought the clearest way to preview the content is to start at the end. That’s how you write a mystery novel, but in this case I mean it literally. Below is the actual Conclusion to The Comics Form. It condenses the preceding 206 pages into three very very dense pages. Rereading them now out of context hurts my brain a little, so good luck. Still, I hope there are enough intriguingly shaped fragments in there to pique your interest to explore the book. Raise your hand if you have any questions (i.e., email me).


A poem consisting of fourteen iambic pentameter lines following an ABABCDCDEFEFGG rhyme scheme is a Shakespearean sonnet because it is in the Shakespearean sonnet form. A work in the comics form is formally a comic for similar reasons. Unlike Shakespearean sonnets, however, a comic may be defined by other than form. A work may be a comic contextually, stylistically, conventionally, or by other criteria independent of or non-exclusive to form. Though it may be a comic according to multiple sets of defining criteria, if the work satisfies one set but not another set, it is both a comic and not a comic. The apparent paradox is due to each set using the same term, even though each usage is distinct. A ‘comic’ is not a ‘comic’ is not a ‘comic.’

The Comics Form defines the comics form by extracting the two most common physical features from a range of comics definitions and combining them as ‘sequenced images.’ Sequenced images may or may not define comics generally, but if a work consists of sequenced images, the work is in the comics form and so can be analyzed formally as a comic. Although that may be sufficient for the work to be considered a comic, others might understand that a work must be, for example, a mass-produced replica or be created, produced, and purchased with the understanding that it is a comic. I refer to such media-defined works as the comics medium. A single-image cartoon in a newspaper comics section is in the comics medium, but because it is not in the comics form, it is outside the scope of this study.

The terms ‘discourse’ and ‘diegesis’ differentiate images’ physical qualities and representational qualities. All images have discourses and many also have diegeses. Since I derive ‘image’ from extant comics definitions, I also infer its discursive constraints: an image in the comics form is a visual, static, flat image juxtaposed with another. If it is also a representational image, it represents some subject matter: the diegesis experienced in the mind of a viewer interpreting its discourse. Like ‘discourse,’ ‘diegesis’ has other usages, but I adopt an expansive meaning: all representational content, either overtly depicted or implied, including the larger context of a world. Diegeses vary between viewers but also presumably overlap significantly. Non-representational images have no diegeses, only discourses—which must still be mentally experienced, but without the construction of a mental model with diegetic qualities understood to be separate from the discourse.

Analyzing representational images involves a range of approaches for relating their discursive qualities to the diegetic qualities they produce. Rather than focusing on unknowable authorial intentions or the illusion of intentions in characters, I focus on viewers’ experiences of intentionality in author-constructed narrators, focusing first on image-narrators, which communicate diegetic content through an image’s discursive qualities. An image’s style is in one sense discursive: an arrangement of marks on a surface. In another sense, style is diegetic: subjects depicted in a certain manner. Style may then be understood as semi-representational: discursive qualities that represent subjects non-literally and indirectly. Style may follow certain norms or modes, including cartooning and naturalism, as well as other combinations of exaggeration and simplification. Those norms are understood to represent subject matter through overspecified or underspecified details that are not aspects of the diegesis, except indirectly through connotations. Viewpoint and framing effects are similarly semi-representational.

To be in the comics form, a work must include more than one visual, static, flat image. Distinguishing multiple images from a single image with multiple units poses a challenge. Common terms ‘panels,’ ‘frames,’ and ‘gutters’ are metaphors to describe drawn qualities that are not determining. Unless images are physically divided—two framed paintings hanging on the same wall, for example, or two facing pages—image division is determined by viewer perception. A physically unified page or canvas of visual subunits consists of multiple images only if a viewer perceives it to be multiple images. Like style, a layout of gutter-divided panels straddles discourse and diegesis. The division of images is neither a discursive quality nor a diegetic quality in the same sense as the images’ subject matter, because the divisions are not part of that diegetic world. Where style seems diegetically transparent (subjects are drawn as if accurately reflecting their literal appearance in their world), image divisions seem discursive (images are drawn as if separated physically in the discourse). Distinguishing the two effects, style is semi-representational and layout is pseudo-formal. Because pseudo-formal qualities are not physical qualities like page dimensions and divisions, pseudo-form is a kind of diegesis, but one separate from the primary diegesis of the representational content, and so a secondary diegesis.

For images to be sequenced and so to be in the comics form, they must be juxtaposed in one of three possible ways: 1) contiguous: images appear simultaneously within a single visual field; 2) temporal: an image appears immediately after a previous image in the same visual field; or 3) distant: a non-contiguous image that does not immediately follow a previous image is mentally recalled while observing a current image. Contiguous juxtaposition describes the pages of most works in the comics medium in which viewers understand panels to be separate images. Temporal juxtaposition is the norm of films but occurs in static images when, for instance, a viewer turns a page. Distant juxtaposition is dependent on memory and so may be juxtaposition in only a metaphorical sense. A viewer of a sequence may at any time recall a previous image and relate it to a current image, discursively, diegetically, or both. Contiguous juxtaposition also includes braiding effects (with and without repetition) in which the discursive relationship of visual elements influences a viewer’s understanding of their corresponding diegetic qualities.

Juxtaposed images trigger inferences. The most fundamental inference is recurrence: marks in separate images are understood to be representations of the same subject. Recurrence is reinforced by a parallel phenomenon, diegetic erasure, in which discursive qualities that would produce diegetic contradictions are ignored. Juxtaposition produces ten additional types of inferences: 1) spatial: images share a diegetic space; 2) temporal: images share a diegetic timeline; 3) causal: undepicted action occurs between depicted moments; 4) embedded: one image is perceived as multiple images; 5) non-sensory: differences between representational images do not represent sensory reality; 6) associative: dissimilar images represent a shared subject; 7) semi-continuous: discursively continuous but representationally non-continuous images are perceived as a single image; 8) continuous: images are perceived as a single image; 9) match: otherwise dissimilar images share matching similarities; and 10) linguistic: images relate primarily through accompanying text. The first seven are diegetic only; the next two can occur both discursively and diegetically, or discursively only; and the last is not primarily a result of image juxtaposition and so arguably is not a type of juxtapositional inference. While the subtypes of diegetic inferences are only discernible through analysis of the story world, discursive inferences must be analyzed at the level of the page. Purely discursive inferences involve only discursive marks understood in terms such as shapes and values without reference to representational content.

For images to be sequenced, they must be juxtaposed, but the relationship between sequence and juxtaposition is ambiguous. The juxtaposed images of a sequence follow a specific order. The juxtaposed images of a non-sequence, or set, follow no specific order. Since a set can be juxtaposed contiguously, when, for example, organized into a book or gallery, order has two kinds: 1) discursive order: the successive but non-diegetic arrangement of images, reflecting only happenstance, convenience, and/or the needs of physical presentation; and 2) sequential order: the successive arrangement of the images, reflecting some diegetic quality of the representational content. Sequential order and discursive order are identical for sequences.

When images, whether sets or sequences, are contiguously juxtaposed in a visual field, viewing them produces a viewing path that is either: 1) directed: determined by the image order of the sequence; or 2) variable: indeterminate and so open to multiple discursive orders. Since image orders and image viewing paths are apprehended simultaneously, both are an additional type of juxtapositional inference in which a viewer determines relationships between contiguous images. If other inferences (recurrence, spatial, temporal, etc.) suggest a sequential order organized in a directed viewing path, the images are hinged. Unhinged viewing describes variable viewing paths of a set arranged discursively but non-sequentially.

The image qualities of content, relationship, order, paths, and hinges suggest a six-part typology: 1) representational sequence: two or more related and ordered representational images, with, if contiguously juxtaposed, hinges that produce a directed viewing path; 2) non-representational sequence: two or more related and ordered non-representational images, with, if contiguously juxtaposed, hinges that produce a directed viewing path; 3) representational set: two or more related but unordered representational images, that, if contiguously juxtaposed, are viewed in variable paths; 4) non-representational set: two or more related but unordered non-representational images, that, if contiguously juxtaposed, are viewed in variable paths; 5) representational arrangement: two or more unrelated and unordered representational images with no contiguous hinges; and  6) non-representational arrangement: two or more unrelated and unordered non-representational images with no contiguous hinges. 
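For readers who think better in code, the six-part typology can be restated as a small decision procedure. The Python sketch below is only an illustration, not anything from the book: the names `ImageGroup`, `kind`, `classify`, and `viewing_path` are hypothetical, and treating every unrelated group as an arrangement (whether or not someone imposes an order on it) is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageGroup:
    """Two or more images, described by the three qualities the typology uses."""
    representational: bool  # content: representational vs. non-representational
    related: bool           # relationship: do the images relate to each other?
    ordered: bool           # order: do the images follow a specific order?


def kind(group: ImageGroup) -> str:
    """Return 'sequence', 'set', or 'arrangement'."""
    if not group.related:
        # Assumption: unrelated images are an arrangement regardless of order.
        return "arrangement"
    return "sequence" if group.ordered else "set"


def classify(group: ImageGroup) -> str:
    """Return one of the six labels in the typology."""
    content = "representational" if group.representational else "non-representational"
    return f"{content} {kind(group)}"


def viewing_path(group: ImageGroup, contiguous: bool) -> Optional[str]:
    """Viewing path for contiguously juxtaposed images, if the typology names one."""
    if not contiguous:
        return None
    # Sequences are hinged (directed path); sets are unhinged (variable paths).
    # The typology names no path for arrangements, only the absence of hinges.
    return {"sequence": "directed", "set": "variable"}.get(kind(group))


if __name__ == "__main__":
    page = ImageGroup(representational=True, related=True, ordered=True)
    print(classify(page), "/", viewing_path(page, contiguous=True))
```

The point of the sketch is only that the six categories fall out of three yes-or-no questions, with hinges and viewing paths following from the answers rather than serving as independent criteria.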

Representational sequences also provide a means to explain and constrict McCloud’s closure. Viewers make inferences about a diegetic world based on image content in combination with independent knowledge and assumptions about the depicted world. Those assumptions are usually mimetic, applying, for example, laws of physics to objects and human psychology to characters. The undrawn content that viewers experience through the juxtaposition of two or more representational images is the minimal content required by a viewer’s mental construction of a partially drawn but fully implied event, consisting of definable subunits. While anything could occur in the ambiguous lapse of time implied by paired images of discrete moments, viewers understand the images as parts of a unified event. According to event inferencing, any content that is not part of that event is not implied.

Though text is not a necessary quality of the comics form, many sequenced images contain text. An image-text is an image that combines linguistic and non-linguistic content, and sequenced image-texts are in the comics form. Since all text is necessarily images, there are two types: 1) word-image: an image with linguistic content; and 2) word-image art: a word-image rendered as graphic art. Since word-images and non-linguistic images function as though on parallel and independent paths, image-texts involve three kinds of narrators: 1) image-narration of non-linguistic content; 2) text-narration of linguistic content; and 3) image-text narration of combinational effects of linguistic and non-linguistic content, which produces embedded relationships including double referents.

These are the qualities of sequenced images, which together explain the comics form.

This is a page from Dennis Hopeless and Javier Rodriguez’s 2016 Spider-Woman: Baby Talk. It’s another of my favorites, which is why I include it in The Comics Form: The Art of Sequenced Images. Or rather I include a gray-scale version of it. Though Bloomsbury gave me space for quite a few illustrations, book publishing is always more limiting than blogging. While I couldn’t subdivide the page to talk about each juxtaposition there, I can here. The illustration appears at the end of Chapter 5’s discussion of juxtapositional inferences, and my analysis also draws on some terms from earlier chapters:

  • Discursive: qualities of the physical image.
  • Diegetic: qualities of the represented subject.
  • Spatial inference: images share a diegetic space.
  • Temporal inference: images share a diegetic timeline.
  • Causal inference: undepicted action occurs between depicted moments.
  • Continuous inference: images are perceived as a single discursive image.
  • Semi-continuous inference: discursively continuous but representationally non-continuous images are perceived as a single image.
  • Erasure: viewers do not register discursive elements that do not fit their diegetic understanding.
  • Ocularization: image viewed as though from the angle and proximity of a character.

The character is Captain Marvel, but because she goes unnamed, I don’t mention her by name. Though the page features a 3×3 grid, five of the panels produce a unified image through continuous inferences. Numbered by Z-path viewing, panels two, five, six, seven, and eight are a single discursive and diegetic unit.

Because that unit is the dominant feature of the page, it likely encourages a viewer to apprehend the page as a whole first rather than beginning in the top left corner as the viewing norms of a 3×3 grid would prompt. The centered “KOOF” art in panel five also likely attracts a viewer’s eye to the center of the page and so to the center of the continuous unit first. Switching to Z-path viewing requires some loosening of the continuous effect to apprehend each panel individually. Even though it contains only an ellipsis and so indicates no spoken sound, the presence of a speech container in the top left panel encourages a Z-path by drawing a viewer’s eye to its starting point.

In terms of spatial inferences, the setting is minimal and is represented only by an undifferentiated green background discernible in most panels. The green is likely perceived as transparently literal, even if its slight discursive variations are not. The discursively white background in the bottom left panel, however, is likely diegetically erased or, if noticed, understood non-transparently through non-sensory inference.

Since the substitution of white for green probably suggests nothing diegetically, it likely does not trigger associative inferences. The white then is only for discursive effect.

Spatial inferencing also explains the differences between each pair of panels as the result of the differences in the position of the implied viewer in relation to the central and unmoving figure. The first juxtaposition involves a change in the implied viewer’s proximity and angle. The second juxtaposition involves both, too, but while the proximity of the third image follows the trajectory of the first two images (the implied viewer is nearing the figure), the angle instead reverts to the forward-facing view of the first panel. Because such movements do not correspond to the movements of an implied character, the spatial inference suggests no ocularization.

But the final image could be ocularized by either of the two characters present in the backgrounds of panels six and seven. Even if non-ocularized, the implied viewer has rotated 180 degrees, explaining why the central figure is no longer facing forward.

Alternatively, the figure has turned herself around as inferred through a causal inference. If so, she is now facing the two background characters, who, due to framing, do not appear in the image. I suspect most viewers would interpret the figure’s ending posture as speaking over her shoulder to the characters behind her, and so no causal inferences would be involved. If the figure did turn around, a viewer would experience a different kind of temporal inference, one indicating a slightly greater period of time to account for the implied action.

Temporal inferencing is also necessary for parsing the five-panel continuous unit. The unit, because it appears more than once in the linear panel progression, represents more than one diegetic moment despite being discursively recurrent. Calling the combined continuous unit “A,” the linear viewing sequence is: 1, A(2), 3, 4, A(5–6), A(7), A(8), 9.

First, note that temporal inference combines panels five and six into a two-panel continuous unit within the larger five-panel unit because the mid-air position of the flying fragments indicates the same moment. The space between panels five and six therefore divides neither spatially nor temporally.

The space between panels seven and eight does divide temporally because the placement of a speech container within each panel segments time.

Similarly, the space between four and five is temporal because of the action of the figure crushing the object (identified in the story as a phone). The change in the color of her glove from red to white is presumably the result of the electrical discharge from the crushed phone. The three changes (untightened to tightened fist, red to white glove, whole to shattered phone) all produce the temporal inference.

Temporal inferences also challenge the transparency of the apparent five-panel continuous image. Though panels A(2) and A(5–6) appear to represent a unified image and so a unified moment, panel four establishes that the phone is not yet broken during the moment of A(2), meaning the effect between A(2) and A(5–6) is not continuous but semi-continuous.

It only appears to be a single continuous image, since the corresponding position of her fingers during the moment of A(2) could not match. In the first panel, which is temporally closer to A(2) than to A(5–6), the figure has not formed a fist yet since one of her fingers is higher on the phone.

Temporal inferences produced by the speech containers also require that A(7) and A(8) represent different moments from the rest of the apparent continuous unit, producing instead a semi-continuous unit of a single, stationary figure viewed from the same angle and proximity, but subdivided into four temporal units.

Finally, while panels two and five are not contiguously juxtaposed, viewers likely recognize them as diegetically continuous.
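The bookkeeping in the temporal analysis above can be checked with a few lines of Python. This is only a sketch of the linear viewing sequence as I describe it (1, A(2), 3, 4, A(5–6), A(7), A(8), 9); the names `VIEWING_SEQUENCE`, `unit_panels`, and `temporal_units` are hypothetical labels introduced for illustration.

```python
# Each viewing step: (label, panels it covers, whether it belongs to unit "A").
VIEWING_SEQUENCE = [
    ("1", (1,), False),
    ("A(2)", (2,), True),
    ("3", (3,), False),
    ("4", (4,), False),
    ("A(5-6)", (5, 6), True),  # panels five and six share a single moment
    ("A(7)", (7,), True),
    ("A(8)", (8,), True),
    ("9", (9,), False),
]


def unit_panels(sequence):
    """Panels that make up the apparently continuous unit."""
    return sorted(p for _, panels, in_unit in sequence if in_unit for p in panels)


def temporal_units(sequence):
    """Distinct diegetic moments into which the unit is subdivided."""
    return [label for label, _, in_unit in sequence if in_unit]


if __name__ == "__main__":
    print(unit_panels(VIEWING_SEQUENCE))     # the five panels of the unit
    print(temporal_units(VIEWING_SEQUENCE))  # the moments within the unit
```

Listing the steps this way makes the mismatch explicit: five panels read discursively as one image, while the temporal inferences divide that image into four moments.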

The chapter ends with discussions of four other pages too (from Black Panther, Batwoman, and Hawkeye). These close viewings (I avoid using the verb “read” for visual analysis that involves almost no actual reading) were some of my favorite parts to draft.

That’s a page from Matt Fraction and David Aja’s 2012 Hawkeye: My Life as a Weapon, one of my all-time favorites. I discuss it in The Comics Form: The Art of Sequenced Images, but the very reasonable limitations of book publishing prevented me from dividing the page into various smaller units to illustrate the point-by-point analysis.

That’s what blogs are for.

I really do try to avoid jargon, but I have some terms I find increasingly useful, so I’ll gloss them first:

  • Discursive: qualities of the physical image (in this case that’s pixels).
  • Diegetic: qualities of the represented subject (in this case that’s the world of Marvel Comics).
  • Spatial inference: images share a diegetic space.
  • Temporal inference: images share a diegetic timeline.
  • Continuous inference: images are perceived as a single image.
  • Semi-continuous inference: discursively continuous but representationally non-continuous images are perceived as a single image.
  • Associative inference: dissimilar images represent a shared subject.

The characters here are of course Hawkeye and, well, Hawkeye, AKA Clint and Kate. Their names aren’t mentioned on this splash page, so I don’t refer to them by name either.

Here it goes:

Uniform vertical and horizontal negative spaces divide the page into twelve rectangular but otherwise irregular images. The presence of words from a continuous statement (“Okay—this looks bad … Don’t die.”) placed inside caption boxes in the top left corner and the bottom right area encourages a general top left to bottom right viewing, but the norms of either Z-path rows or N-path columns are disrupted by the image arrangement and image content, producing no clear viewing order or even a method for naming images discursively. If viewers do begin with the first captioned word, they likely switch to apprehending the page as a whole, focusing first on the largest panel and the dominant content of the two figures diving underwater as bullets speed past them. Because the bullets’ paths through the water follow the same lines as those in the top left corner, continuous inferences unite the panels as a single unit, even though they share only corners and no borders. After closer inspection, the bottom left panel likely joins the same continuous unit.

The bullet trajectory lines in the top right corner also follow roughly the same pattern, so a viewer’s eye may be drawn there next. Spatial inferencing establishes a more distant implied viewer since the bullet lines (which are through air rather than water and so literal only if understood as a kind of blur) and the figures are smaller. The contrast in distance blocks the continuous effect otherwise encouraged by the layout, since the top right panel content is also “above” the diegetic content of the larger panel below it discursively. Since the two are viewed from similar angles, it is only the difference of proximity that divides them diegetically. That almost continuous effect is further suggested by the near alignment of the pool edges in the top corner panels, which is itself heightened by the placement of a horizontal negative space between the middle panels in the same row, creating the discursive illusion of an intermediary step between the differing representations of the pool edge.

The second panel along the left margin also features a similar pattern of bullet trajectories, but spatial inferencing again divides the panel from the continuous unit, this time through an implied viewer placed in contrastingly close proximity to the bullets.

Two other panels in the top left region stand apart discursively because of the accenting use of yellow. The two are also diegetically linked because they feature the guns firing the bullets.

Diegetically, the second panel along the top margin belongs above and to the left of the first panel—a mental rearrangement that a viewer experiences through spatial inferences. The content of the second yellow-dominated panel is also diegetically “below” the discursively higher panel, but with the same change in proximity of the implied viewer as the panel abutting it, so that the bullets in the water and the bullet shells in the air are similar sizes. The two yellow panels also produce their own two-panel viewing path that cuts diagonally across the opposite diagonals of both the paths of the bullets and the reading path of the captions.

The seven panels described so far could occur simultaneously. The smallest remaining panel, the lone square positioned near the center of the page but grouped with the close-up panels, appears to represent a similarly close view of churning water and so is also likely understood to occur at the same moment.

Four images remain. Two are ambiguous because their content is not overtly related, and the other two produce temporal inferences distinct from the rest of the page.

First, the figure in a bikini from the top right panel recurs in a lower left panel; instead of being struck by a bullet and falling from a standing position, the figure is floating face down in the water. The diegetic trajectory of the figure’s previous falling posture aligns spatiotemporally with the body’s later position in the pool, producing a causal inference. The juxtaposition also requires a new implied viewer position situated on the other side of the pool and so opposite the implied viewer of the other images. The apparent stillness of the figure and the water suggests a similar temporal leap to a later moment well after the action depicted in the surrounding images.

The recurrent figure also creates a two-panel viewing path that echoes the right-to-left path between the two yellow-dominant panels that appear discursively “before” them if a viewer is attempting a general top left to bottom right direction. The second pair of panels are more discursively distant from each other than the first pair, but in the same diagonal relationship, and so their placements also echo the widening trajectories of the bullets in the continuous unit.

The final image in the bottom right corner features the recurrent figure of the diver discursively above it, producing the spatiotemporal inference that he is no longer descending into the water from the force of his dive but has now slowed. Though his arms are framed out of the image, they appear to be at his sides. The temporal inference would divide the two images by roughly a few seconds. This means that the “last” discursive image on the page is not the “last” diegetic image because the previously described panel containing the floating corpse would follow it temporally.

Finally, two of the panels feature graphic designs surrounded by uniformly black areas that seem disconnected from the diegetic events, in part because they do not suggest the illusion of three-dimensionality. Viewers with prior subject knowledge will recognize the higher panel’s spiral pattern as the symbol of the story’s villain featured on his hat, producing an associative inference. Viewers without prior knowledge will be introduced to the symbol and character six pages later, applying the associative effect retroactively. The juxtaposition of the spiral icon beside the image of the firing guns may suggest that they are being fired either by the villain or, since there are multiple guns, on his behalf.

A lower left panel contains another ambiguous icon, later revealed to be associated with a second villain employed by the first, which produces similar foreshadowing.

Something I don’t mention in The Comics Form but that I’ll address here: Aja’s art works against Fraction’s tone and superhero genre norms. Hawkeye’s internal narration is comic and contrasts the dire situation. But it’s considerably less comic for the words “Okay—this looks bad … Don’t die” to appear between images of bystanders being struck by stray bullets and a floating corpse. Worse, because the deaths go unnoted by the heroes later, it suggests their indifference.

I assume that wasn’t the intention. Fraction’s script likely didn’t mention any deaths because Aja invented the incident while drawing. The heroes are unconcerned about dead bystanders because there were none. Or at least there were none as originally written by Fraction. The comic, however, is not the script.

Since I discussed the new Philosophy of Comics in two previous posts (here and here), I thought it fair to give the authors a chance to respond, which they kindly do below.

Guest bloggers, Sam Cowling and Wesley Cray


We can’t thank Chris enough for the kind words about the book. Happy to report, too, that we specifically requested to Bloomsbury that they approach him for a blurb, given how useful we found his own work as well as his collaborative efforts with Nathaniel Goldberg. (Big plug for Chris’ “‘Something Like This Just Couldn’t Happen!’: Resolving Naturalistic Tensions in Superhero Comics Art” in Studies in Comics.) And we’re grateful, too, for the chance to say a bit in response to Chris’s perceptive remarks here.

There are a lot of moving pieces, and Chris points in some fruitful and interesting directions that we won’t be able to tackle in a short post. Mostly, we’ll aim to give a rough sense of how we are inclined to approach the taxonomic questions around comics (the things) and ‘comics’ (the word or concept).

It’s scarcely debatable that the term ‘comics’ is vague. When philosophers think about vagueness, we typically model it using precisifications: various contextually acceptable means of making the meaning precise. In this way, precisifications are sharpenings. One way (but surely not the only way!) to view the literature on defining ‘comics’ is as a purely semantic debate that presupposes that there’s a uniquely acceptable sharpening. We’re pretty happy to grant that there are actually a range of acceptable sharpenings, but that their aptness depends upon the context of inquiry and conversation. For our part, the aim of the first chapter of our book (aside from the implicit aims of introducing different kinds of comics and some general philosophical methodology) is to ask what might demarcate the most general sort of precisification that folks have in mind when they, say, debate who should win an award at Angoulême, be featured in Kramers Ergot, or be read in a class specifically centered on comics. Our best shot, a shot that requires a lot more than a chapter to count as a full-fledged account, is that comics are artifacts produced to be engaged with through a certain kind of reading.

We’re super interested in Chris’ proposal. (What are pre-orders for, after all?) In part, it’s because we’re not sure what a “form” is. We recognize that there can be—and is—ample disagreement about what categories count as mediums—e.g., Matthew Smith and Randy Duncan suggest that there isn’t a general medium of comics in The Power of Comics. Our untutored, prima facie hunch is that forms are intimately bound up with aesthetic engagement. Roughly, “form” is properly viewed as a distinctive kind of aesthetic category that subsumes all and only those things that are created with suitably related aesthetic purposes, engaged via suitably related aesthetic techniques, and evaluated using suitably related aesthetic criteria. That conception of form would make one sharpening of ‘comics’ pick out a unified aesthetic kind. It would presumably exclude certain artifacts that fall outside that aesthetic remit. That’s surely a relevant sense of ‘comics’ that we would want to point to in making sense of the medium—e.g., it would explain why certain kinds of non-art instructional comics might fall outside of the relevant form. We’re eager to hear more about Chris’s views and see if they point in this direction.

Pending a better sense of how to disentangle forms from mediums, our lone departure from Chris is probably about the semantics of ‘comics’. Homophonic ambiguity of the sort Chris mentions seems less apt than polysemy for capturing the proposed distinction in meanings, since the meanings are clearly related (like the noun ‘face’ and the verb ‘face’) rather than accidental homonyms. Our hunch is, however, that the multiplicity in meaning is more modest: that it’s vagueness rather than ambiguity (or polysemy).


What should our focus be trained on when taking up the venerable (but largely frustrating) question: which things are comics? In Philosophy of Comics, our general hunch is that we are best served to focus on comics as artifacts and, in doing so, keep in view their commonalities with things like bathtubs, doorknobs, and cowboy hats. They are human creations, produced by intentional processes, with certain kinds of functions in mind. As Chris aptly notes, any approach of this sort is liable to be messier than accounts that focus upon specific formal elements or historical traditions that seek a precise account of comics. This is liable to leave the account we prefer comparatively vague, but notice that it would be a tremendous surprise if we had laser-like clarity in definitions of other sorts of artifacts like bathtubs or doorknobs. That doesn’t mean, of course, that anything goes, and in this short note we’ll say a bit about how we’d respond to Chris’ criticism.

Two quick caveats before digging in: (1) In the book, we place heavy emphasis on the practice of picture-reading and mark the centrality of a theory of picture-reading for comics theory. That said, we don’t develop a fully-fledged theory in the book, but we do note at least two very different ways to go in developing a theory: a hard-line psychological account and a rougher normative or sociocultural account. Our sympathies reside with the latter, but there’s a whole spectrum of views available in between. We hope to map them out and put them to work elsewhere, but however you go with your theory of picture-reading, it will have a substantial impact on which artifacts will count as comics. (2) It’s crucial to note that the Intentional Picture-Reading View takes comics to be those artifacts that are created with apt intentions for picture-reading. That means, among other things, that not just any object that could be picture-read counts as a comic. Doubtless, we can try to picture-read non-comics and we might even succeed in some cases, but we deny that all picture-readable things are therefore comics; indeed, that’s partly why we’re committed to an intentionalist view. A rough parallel: comics are like crowns, not door stops. For something to be a crown, it needs to be created with a certain kind of intention. Door stops merely need to serve a function and most anything can be appropriated as a door stop. Quick moral: the creative intentions matter in ways that merely possible uses do not.

This second caveat is a significant one for marking our departure from Chris and for explaining our view. While Chris is surely correct to note that creative or authorial intentions are a thorny topic, we take them to be absolutely essential to our preferred account. Chris’s interesting “adjusted” definition elides intentions and so the resulting view is one on which anything that could in principle be picture-read is a comic. We think that yields a far too generous view for pretty much the reasons Chris notes. There are paintings that aren’t comics that look markedly similar to things that are comics and the difference between the two can’t be explicated in terms of things someone could do with either. (Each could be a door stop, after all.) But on the Intentional Picture-Reading View, the fact that, say, Gahan Wilson intended his comics to be picture-read is part of what separates them from a piece of line art that is, say, merely intended to depict a barn on fire. So although someone *could* certainly attempt to picture-read paintings and other non-comic artifacts, they regularly and correctly do other things with them (e.g., looking at them in ways that treat text and sequence differently than we would in comics), and even if such artifacts are picture-read, that wouldn’t make them comics. Again, that’s because it’s the artifactual intention that matters. Importantly, that makes our view potentially quite narrow, contrary to the “adjusted” view Chris sketches.

The final concern Chris notes, namely how informatively we can characterize picture-reading and, in turn, how distinctive it really is, strikes us as *the* question for the Intentional Picture-Reading View. We reject the generic view that would assimilate picture-reading to the tremendously broad act of looking at picture-based artifacts. For our part, we take picture-reading to be a specific practice essentially tied to phenomena like panels, text-image interaction, and grawlix. Accordingly, articulating a comprehensive and credible theory of picture-reading is *the* project at the heart of the philosophy of comics. And maybe the project for another book.

As I discussed in a previous post, I’m a big fan of Sam Cowling and Wesley Cray’s Philosophy of Comics. I think it may surpass Nathaniel Goldberg’s and my Superhero Thought Experiments. Keeping in mind that praise, I do object to their definition of comics—the first new definition I’ve seen presented by any comics scholars for several years.

Sam and Wesley (who I call by their first names since Sam and I know each other by email) offer what they term a “functional approach to defining comics,” one based on “a characteristic use as objects,” specifically that “comics are to be ‘read.’” This would mean that “comics are ultimately a functional artifact rather than one that can be defined formally or historically.”

Though I agree that comics overall cannot be defined formally or historically, I instead define the comics form and the comics medium separately, and then I use those two definitions to determine whether a given work is in one, the other, or both. This approach produces no general definition of comics. Let’s call that the homonym approach, since it treats ‘comics’ as a word with two non-exclusive meanings. Sam and Wesley follow the single-definition approach, calling their definition the ‘Intentional Picture-Reading View.’

What does it mean to ‘read’ a comic? They identify their use of the verb ‘read’ as a “linguistic accident,” because “whatever reading we do when we engage with comics, it is not the same activity as the reading we undertake when we engage with a novel.” Following Wertham, they call this distinctive kind of intended activity ‘picture-reading,’ and they characterize it as an “openness” to various “sociocultural practices” such as “incorporating one or more images into our unified attention,” “taking juxtaposed images as components of a narrative,” “finding closure among panels,” and “taking text (or a solitary image) as determining what’s true according to the narrative.” That produces the following comics definition: “x is a comic if and only if x is aptly intended to be picture-read.”

I think authorial intentions are an unnecessary and distracting topic, but rather than diving down a non-useful rabbit hole, I’ll adjust their definition to avoid it: “x is a comic if and only if x is perceived as aptly intended to be picture-read.” (Sam and Wesley seem to suggest something along this line through their later requirement that “competent audiences would be able to picture-read it and that competent audiences would recognize it as an attempt at producing something for picture-reading.”) The result is the same: something would be a comic because it is or can be “regarded with a certain kind of attention.”

Interestingly, they assert that this kind of attention also applies to single images, “since juxtaposed images … aren’t required for picture-reading,” just “an openness to incorporating juxtaposed images into one’s pattern of attention.” Does an openness to incorporating juxtaposed images into one’s pattern of attention require the physical presence of juxtaposed images? If so, then can viewing a single image produce it? Looking at the Mona Lisa or an installment of The Far Side does not involve the expectation of additional images entering one’s attention or an openness to taking juxtaposed images as components of a narrative or finding closure among them. It would likely exclude such things since each single image is understood instead to be a complete work. When other images are juxtaposed (on a gallery wall or on a newspaper comics page), those other images are likely not regarded with the same kind of attention and sociocultural practices as the perceptually isolated single image.

That suggests to me that single images are not picture-read in the sense that their definition requires. Since Sam and Wesley reject what they term the Deliberate Sequence View because it does not account for “the objection from single panel comics,” their Intentional Picture-Reading View could suffer similarly.

Alternatively, picture-reading does apply to single images. When I look at the Mona Lisa I am certainly open to incorporating the solitary image into my unified attention and to taking it as determining what’s true according to the narrative. It would seem then that any single image could be picture-read, or certainly any single-image narrative artwork. If so, then so many things become comics that ‘comics’ does not appear to differentiate a meaningful category of works.  

Returning to multiple images, Sam and Wesley examine the example of an art gallery owner hanging three paintings on a wall and then afterwards declaring that the three images are a comic. According to their definition, the three paintings are a comic if “a competent comics reader recognizes” them as a comic, and the three paintings are not a comic if such a reader does not. They acknowledge that perceptions will likely vary, concluding: “This, we suspect, is where we ought to expect and therefore accept vagueness in a proposed definition of comics.”

While individual perceptions of most anything can vary, the primary vagueness here is not general to any proposed definition of comics, but only to those that rely on vaguely defined sociocultural practices. Perhaps such practices are inadequate for comics definitions. 

Regarding such pre-comics works as the Bayeux Tapestry, Sam and Wesley argue that “the sociocultural activity of picture-reading in its current form had not yet arisen when it was created” and “simply was not operative in [that] context.” Therefore the Bayeux Tapestry is not a comic, even though its creators “intended for it to be looked at and read in some sense.” In what sense might those creators have intended their work to be read, and how does that kind of picture-reading differ from the picture-reading that has the “distinctive history” required for a work to be a comic? Sam and Wesley state that for such things as the Bayeux Tapestry to not be comics “a distinction is needed between picture-reading as a historically specific activity and a more general, arguably universal activity of pictorial storytelling.” Since they do not offer such a distinction, it would seem then that the Bayeux Tapestry might be a comic according to their Intentional Picture-Reading View.

What images are not comics according to that view? The potential range seems to include all single-image and multi-image art of any culture and time period.

What are the necessary and sufficient qualities of ‘picture-reading’? I suspect the term could be substituted with something like ‘comics-reading’ or ‘comics-medium-reading’ without a discernable change in meaning. If so, the proposed definition seems circular: “x is a comic if and only if x is aptly intended to be read as a comic.”

The above, and my previous blog too, are my tentative objections to the only two claims that I didn’t find immediately persuasive in all of Philosophy of Comics. Which is to say: it’s a pretty damn persuasive book.

But to be fair, I’ve invited Sam and Wesley to respond, and I will post their comments to my comments tomorrow.

Poet and literary critic Lesley Wheeler writes in her new hybrid essay collection Poetry’s Possible Worlds:

“The distinction between [fiction and nonfiction] rests not in intrinsic differences but on information external to the text. This story is on the front page of a trustworthy newspaper: factual. That one appears beside a moody illustration near the end of The New Yorker: you think ‘fiction’ and you assume references are invented or at least disguised and manipulated.”

My co-author Nathaniel Goldberg and I draw a similar philosophical conclusion in our Revising Fiction, Fact, and Faith: “A discourse is a fictional or factual diegesis if and only if read as that kind.” And we include an example that vacillates according to the kind of “information external to the text” that Wheeler mentions.

The New York newspaper The Sun published Richard Adams Locke’s “Great Astronomical Discoveries Lately Made by Sir John Herschel” in daily installments in August 1835. Though The Sun proved extremely untrustworthy, most readers read the story as factual until they got to descriptions of bat-winged creatures building temples on the moon, and even then many believed the hoax until the newspaper announced it was fiction the following month. Had the story originally appeared in The New Yorker (though hybrid newspapers like The Sun may be the closest early nineteenth-century equivalent), or at least been identified within the publication as fiction, probably no one would have read it as a work of nonfiction.

Note the point that Goldberg and I share with Wheeler: “Great Astronomical Discoveries” is not intrinsically fiction, even though its author wrote it as fiction. That’s because, in our shared view, the status of something being fiction or nonfiction is determined entirely by the experience of readers.

When Wheeler writes “you assume references are invented,” she means references to some world: “The effects readers experience as they enter possible worlds—such as transportation—don’t rely on the authors’ intent to mimic verifiable events, or, for that matter, to distort or ignore them entirely.” It’s about the world a reader imagines.

Goldberg and I discuss that too: “when a factual diegesis refers to the actual world, and a fictional diegesis refers to a merely possible one, each does so by reporting on its respective world.” Wheeler’s invented references are understood to report on a merely possible world, though she’s equally interested in the actual world. Both are kinds of possible worlds. Goldberg and I explain: “While there is only one actual world, which is itself possible, there is an infinity of merely possible worlds.”

Poetry’s Possible Worlds explores that infinity through the world-building enacted by readers of poetry.

Disproving Marie-Laure Ryan’s claim that a short lyric poem such as William Carlos Williams’ “This is Just to Say” is not “a system of reality,” Wheeler reports her experience of its reality: “I visualize an old-fashioned kitchen with an early-model icebox and linoleum table, the kind with a corrugated metal rim. Williams is wearing a light button-down shirt, cuffs open because it’s summer, and his posture is cocky.”

That’s not the kitchen I see, but the fact that I do see some kitchen (the one from my childhood home) proves her point. She continues: “As a reader, I may be an outer-limits case, yet where there is plot and character and sensory detail, imaginative world-building is possible.”

Later she describes a bar she imagined while reading another poem: “I mentally placed it … not in a pub I had really visited but in my Universal Fantasy Tavern. Like many people who seek to lose themselves in books, I recycle imagined settings to save attention for other elements of the work and speed immersion. None of this was conscious until I started researching the cognitive science of literary transportation, but I must have generated many of these spaces as an untraveled preteen.”

The term “transportation” is apt because it implies transportation to somewhere, though apparently never to the same place. Goldberg and I acknowledge this point in order to set it aside: “no two readers may read the same discourse in precisely the same way. Even so, typically there would remain overlap. If extensive, call the resulting diegesis ‘the diegesis.’”

Rather than setting them aside, Poetry’s Possible Worlds delves into those individual readerly worlds fully, revealing that “Taking Poetry Personally” (the title of the introduction) is an inevitability to be embraced rather than ignored. In the process, Wheeler also reveals the underappreciated fact that poetry relies on and produces the same levels of transportive and immersive world-building as longer works of fiction.

Poetry’s Possible Worlds is itself narratively immersive, merging a sequence of literary essays with a novel-like progression of short memoirs about not only the author’s reading experiences but the personal life experiences that surround them and give them context-specific meaning. I read early and multiple drafts of each chapter, but as I reread passages now, I am transported to events in my own life too.

The chapter exploring the thresholds of a poem by a poet she met while on a Fulbright in New Zealand includes the sentence: “Meanwhile, my husband, Chris, planned to work on a novel in our rented house.” My Alzheimer’s-suffering mother haunts the chapter on poetic “Fiction”: “Chris reproached himself for having missed so many signs, but Judy was smart enough to mask incapacity.” “Voice,” which explores another poet’s creative relationship with her husband, includes: “Chris and I are fascinated by literary couples.” But more revealingly: “You’d think Chris, whose first book was a novel, would believe in narrative. Yet since our first years together, Chris has resisted transforming real experience into tales.”

Do I co-write works of philosophy to avoid writing memoir? Possibly. But as Wheeler tells her readers: “All literature, however, even when it’s autobiographical, is fantasy.” Poetry’s Possible Worlds is one of my favorite works of fantasy, and not just because I’m married to the author or because the Acknowledgements and the book as a whole conclude on this sentence: “My real and imagined worlds are indebted to him.”
