Pixel art (dotto-e in Japanese) is a form of visual representation. The term “pixel art” implies “retro game graphics,” as it was the mainstream of video game graphics from the 1970s to the 1990s. In recent years, however, its characteristics have been widely accepted as a distinct graphic style that is somehow both “old and new.” In this article series, we introduce the features and attractiveness of pixel art as a form of representation, taking these trends into account. The third part deals with the peculiar characteristics of pixel art as a form of representing characters and their actions in storytelling.
As video game journalist IKEYA Hayato says, many people view pixel art graphics in old video games as something that “stimulates the imagination,” because pixel art involves a kind of omission that allows players to fill in the blanks with their own imagination. IKEYA himself disagrees with this common belief. He argues that, in his own experience, pixel art presents the appearance of characters exactly as they are, rather than providing room for the imagination. He notes, “Wouldn’t we naturally accept that Mario is just like that pixel art, and that he is adventuring in that kind of world?”1
While my own experience with retro games aligns with IKEYA’s, some people may have different experiences in which their imagination is evoked by pixel art. I am not here to debate which experience is correct or more common. Instead, I aim to show that pixel art may also have some characteristics and appeal that go against the idea of “stimulating the imagination.”
In this article, I will explore the characteristics of pixel art as a form of narrative, specifically focusing on the representation of characters and their actions. To begin, I will briefly examine the various ways in which characters are represented in other forms of narrative.
Game researcher MOHRI Hitomi has outlined different ways of representing characters in traditional Japanese visual and performing arts, such as Noh (theater performed by masked actors), kabuki (theater performed by actors wearing special make-up), bunraku (a form of puppet theater), emakimono, and so on. Based on her research, MOHRI has defined a framework for describing and classifying the various manners of character representation in narrative, and applied it to Japanese role-playing games (Table 1).2
MOHRI’s framework consists of the intersection of the following three perspectives:
Facial expression: Whether the character’s facial expression is explicitly represented.
Gesture: Whether the character’s gestures and limb movements are explicitly represented.
Dialogue: Whether the protagonist’s speech is explicitly represented by text or voice.
Since the third axis, “dialogue,” is not likely to be relevant to the characteristics of pixel art, here I will focus only on “expression” and “gesture.”
In kabuki, the characters’ expressions and movements are clearly conveyed to the audience through the actors’ performances. In contrast, emakimono (illustrated handscrolls) tend not to depict some characters’ facial expressions at all, specifically in a style known as hikime kagibana, meaning “slit eyes and hook nose.” According to MOHRI, Noh (a form of mask play) and bunraku (a form of puppet theater) possess both features; the characters’ movements are explicitly presented while their facial expressions are hardly displayed.3 Furthermore, there is another type of character representation, found in certain illustrations in novels, where facial expressions are explicitly portrayed but gestures are not.
As shown in the table, MOHRI applies the framework based on these three perspectives to character representation in video games that focus on storytelling, specifically Japanese computer RPGs such as the Dragon Quest and Final Fantasy franchises. The table suggests that there are four types of character representation when focusing only on the combination of “facial expression” and “gesture” (A=C, B=E, D=G, and F=H), and that there are video games that belong to each of these types.
Whether or not one agrees with the examples classified, MOHRI aims to use this framework to demonstrate that these forms of narrative and video games vary in how they represent the characters’ emotions, and as a result also differ in terms of “humanness.” Here, “humanness” seems to be roughly defined as a sense of being a real person, that is, a kind of realism or verisimilitude in character representation. According to MOHRI, the degree of “humanness” influences how, and to what extent, the audience imaginatively supplements the characters’ actual appearances.
As can be seen from the fact that she places some video games from the pixel art era, such as Final Fantasy V and Final Fantasy VI, in type A, MOHRI does not view pixel art as inherently lacking “humanness.” In addition, she makes a good point that players do not necessarily imagine a realistic or human-like appearance based solely on the image on the screen, but may instead imagine an unrealistic one based on a complex relationship with paratextual materials such as cover art. Nevertheless, in the sense that she believes that the lack of explicit depiction encourages imaginative supplementation, it can be said that MOHRI also holds an idea similar to the common “stimulation” theory.
In reference to MOHRI’s classification, I would like to examine the characteristics of pixel art as a method of character representation from a slightly different perspective. MOHRI’s framework categorizes various forms of narrative and video games based on how many explicit representations of facial expressions and gestures are present or how explicit they are. However, this point of view does not explain the peculiarity of pixel art, because whether characters’ facial expressions or limb movements are explicitly portrayed is hardly relevant to its ability to represent them. In fact, she classifies video games with pixel art graphics under both type A=C (dynamic and highly expressive) and type F=H (static and minimally expressive). Additionally, although not listed in her table, we may also assume that there are examples that fall under types B=E and D=G.
When considering the characteristics of pixel art, it is more appropriate to focus on how stylized such representations are. Briefly, “stylized” here means “simplified,” “conventional,” or “having a constant pattern that one readily recognizes.” In general, for a representation to be stylized is for it to be non-realistic. From this perspective, it is obvious that pixel art in retro video games tends to have a high degree of stylization. Take as an example Final Fantasy V (Square, 1992, Fig. 1),5 which is one of the earliest video games to make extensive use of characters’ facial expressions and gestures in storytelling. In this game, while the main characters exaggeratedly change their countenance and flap their arms, they use only a few patterns that are simple and formulaic. Like emoticons and emojis, the facial expressions and limb movements of the characters are depicted in a stereotypical and inflexible manner.
Final Fantasy V was followed by masterpieces of Japanese RPGs in the “16-bit” pixel art era, such as Secret of Mana (Square, 1993), Final Fantasy VI (Square, 1994), and Chrono Trigger (Square, 1995, Fig. 2). These works also employ similar methods of character representation as part of storytelling. Although they may have slightly more varied representation patterns of facial expression and gesture than Final Fantasy V, they are not much different in the degree of stylization.
These stylized representations seem to derive from the very nature of pixel art. As stated in the first part of this series, pixel art has relatively few options for representation, at least compared to hand-drawn images, photographs, or physical performances, since it is composed of units that are so coarsely articulated that each unit is distinguishable to the naked eye. Especially in the classical style of Japanese computer RPGs, the number of pixels used to depict each character’s facial parts and limbs is minimal, typically 2×2 or 3×3 at most. As a result, the representation of changes in faces and limbs is inevitably monotonous. Put briefly, it is a feature of pixel art, not a bug, that the representation of characters’ expressions and gestures is stylized.
It’s worth noting that many examples of characters with impassive faces can be found in video games rendered in 3D polygon graphics as well. In fact, as MOHRI rightly places them in type D, video games in the early polygon era often depict characters as expressionless as, or even more so than, those in the pixel art era. However, what makes pixel art peculiar is that even when characters’ countenances are explicitly depicted, they are always stylized. Characters from the early 3D video games may indeed be expressionless, but if they do express emotion, it is likely to be (or at least can be expected to be) represented in a reasonably detailed and subtle manner. Pixel art characters, on the other hand, cannot in principle show any nuanced change in facial expression, given the representational capability of the medium. Whether laughing or angry, blinkered or startled, they do nothing more than change one or two pixels in their faces. Pixel art has to make do with a vocabulary as small as that of natural language, or even smaller, to represent a character’s countenance.
In terms of stylization, the comparison with other forms of narrative also reveals a different relationship than that indicated in MOHRI’s framework. In a conversation with SHIBUYA Kazuko, pixel artist Zennyan says:
I think that animations and performances presented by pixel arts are so memorable that I still like them now. . . . Since pixel arts are a bit symbolic, it is not possible to act characters out completely, to be sure, but it is really nice how characters can express happiness in a puppet-like way, with a little jumping and such. In addition, when incorporated into the story, these movements alone become so enjoyable and moving. When characters hang their head, they look really sad. I felt something unique to Japan about such representations. It is like Noh, where movements are so restricted. . . . Dance is represented just by flapping their arms. I really like the kind of performance that evokes the imagination.6
Aside from the fact that Zennyan also seems to follow the “stimulation” theory, it is noteworthy that he compares pixel art to puppetry and Noh theater, and describes their appeal as “symbolic” and “restricted.” The quality that Zennyan refers to by these words may be almost identical to stylization in my terminology. I agree with Zennyan that pixel art is a highly stylized form of representation, and thus shares certain characteristics and a certain appeal with Noh. Nevertheless, as I will argue below, I believe that they also have another aspect in common, which likewise arises from stylization but is the opposite of what Zennyan describes.
One of my favorite texts that I read again from time to time is Mujō To Iu Koto (On Being Impermanent) by prominent Japanese critic KOBAYASHI Hideo.7 It is a series of critical essays written during World War II, where he offers a unique interpretation of medieval Japanese literary classics and their aesthetics.
In one of the essays, titled “Taema,”8 KOBAYASHI interweaves his personal experience of watching a Noh stage with the aesthetics found in Zeami’s theory of theater, making a scathing comment on the modernist view of humanity, and perhaps of the arts based on it as well. My favorite passages from the essay are the ones in which he contrasts the serene yet robust attraction of masks and actors’ performances in Noh with the “ridiculousness” of contemporaries who were eager to read each other’s faces and minds. Let me quote at length:
What in the world were, what should I call them, those two snow-white socks that briskly started moving with the sound of the flute?9
Why was everyone staring at that weird face [of the mask]? . . . I cannot doubt that intense, ineffable impression; it cannot be that I have been tricked. . . . There are many faces in this hall, but none is so interesting that I cannot take my eyes off it. What unsteady and dull looks they all wear. If I am not concerned about what kind of foolish face I am making in front of people, it means that no one can be responsible for their own face. And yet, we are trying to read each other’s facial expressions and find satisfaction. It is a ridiculous and pathetic thing. . . . “Take off your mask, look at others’ true minds.” It seems that modern civilization has started out with such babbles, without knowing where it would go.10
Correct the movements of ideas in accordance with the movement of the body, for the latter is far more subtle and profound than the former. I believe that he [Zeami] says so. I guess that if he were alive today, he would say that we should wear a mask to hide our stupid countenances that readily mimic shaky movements of ideas.11
What KOBAYASHI criticizes here is the view that emphasizes the “inner self,” “psychology,” or “emotion,” which are generally said to have been “discovered” and valued by modernist literature, and that encourages us to read them off facial expressions. According to KOBAYASHI, Noh representations instead exclude the display of these “shaky” mental movements. Noh’s highly stylized, extremely stripped-down performances, along with the “weird” Noh masks, for which it is often unclear what they are modeled after, have a solid and powerful appeal. And this appeal derives not from their ability to encourage the imagination of inner states, but from their ability to keep us from engaging in such “pathetic” interpretation.
Although I am not sure what I should call the fascination that KOBAYASHI finds in Noh, it is my main claim in this article that the stylized representations of characters in pixel art have an attraction of the same kind. This view is opposite to the “stimulation” theory. Pixel art as a form of character representation has an aspect that makes us imagine nothing about the depicted character’s inner life, or more precisely, one that repels and cancels our attempts to imagine it. It shuts out the “stupid” attitude of eagerly interpreting “unsteady” psychology, and instead shows off highly stylized faces and “brisk” movements with their own robust and intense beauty. To me, this is a crucial part of the appeal of pixel art as a narrative form.
To the Moon (Freebird Games, 2011, Fig. 3) is a successful example of a video game that employs this “interpretation-repelling” characteristic of pixel art in harmony with the story. Although this game was made with RPG Maker XP and as a result looks like a classical “16-bit” Japanese RPG, it is not an RPG but an adventure game with no combat or character development, focusing only on telling a rich story and solving puzzles of sorts.
The plot is roughly as follows. Eva ROSALENE and Neil WATTS, two doctors employed by Sigmund Corp., work to fulfill the wishes of dying people within their own imagination. Eva and Neil use a special machine that can access a person’s brain to go into the patient’s old memories and “plant” the wish there. Then, in the client’s own mind, a new series of memories is re-created in accordance with that wish. If all goes well, the wish is realized in the patient’s imagination, although it is in a sense fabricated. The current client is John WYLES, an elderly man whose wife has passed away, and his wish is very simple: “I want to go to the Moon.” However, the reason for it is unknown, as are the details of John’s past. As John is already on his deathbed when they arrive, the protagonists are not able to talk to him directly. Therefore, the player takes control of Eva and Neil and gradually traces John’s memories from the present to the past, trying to find out what happened in the past and why he wishes to go to the Moon.
John’s wife, River WYLES, is deceased at the beginning of the story, but she is alive in John’s memories and plays an important role in solving the mystery of his request. River is written as a character with a kind of developmental disorder, implied in the story to be Asperger syndrome, and is portrayed as a quiet person who does not express her intentions or emotions, and who therefore has difficulties in social relationships. In fact, it is hard to say that the couple communicates well, at least in John’s memories.
One of the noteworthy aspects of this game is the depiction of River through pixel art. While other main characters, such as the two protagonists and John, have exaggerated and stylized facial expressions and gestures just like the characters in Final Fantasy V and Chrono Trigger, River shows little to no emotional expression. There are only two instances in which River explicitly changes her countenance. One is when she is having fun. If you play the game, you will see that River looks really happy when she is smiling, precisely because she is usually expressionless (Fig. 4). This means that detailed and complex facial expressions are not necessary to depict a scene in which a character who rarely shows emotion is truly enjoying herself. Rather, a monotonous change of facial expressions, like a change of masks, may be more suitable.
The other case where River’s expression changes is when she glances sideways at John’s responses (Fig. 5). On a first playthrough, it is hard to tell whether she is concerned, thinking, or plotting, but the player knows that she is looking at John anyway. These changes in gaze are impressively depicted with a few pixels. In the early part of the game, this countenance serves to give the player a notion of River as an elusive and puzzling figure. However, as River and John’s past is gradually uncovered toward the end of the game, the reason for her sideways glances becomes retrospectively evident. Again, it is her pixelated face, as stylized and reticent as a Noh mask, and not a realistic face that would inevitably be eloquent, that creates this ambiguous yet robust effect.
To the Moon is open to various interpretations and evaluations, but at least to me, the game seems to contain a thought somewhat similar to KOBAYASHI’s. For modern people who read each other’s faces every day, it may be difficult to communicate with an expressionless person. However, this does not mean that there is no person behind an expressionless face, or that they care nothing about others. Access to others is possible in a way that does not involve eagerly trying to interpret “stupid” and “unsteady” facial expressions. To the Moon can be appreciated as a work that presents this view impressively, by successfully combining its subject matter, the failure of communication with a partner who has a developmental disorder, with the characteristics of pixel art as a form of representation.
In the three parts of this series, I have explored the characteristics and appeal of pixel art. Each article highlights that pixel art is not merely a product of technological limitations or a nostalgic relic, but rather a unique form of representation with its own features. Although at some time in the past, pixel art may have been a restriction in computer graphics, it is now one of the options for visual representation. The examples of modern pixel art referred to in the articles illustrate that artists deliberately choose it for their artistic purposes. Where we have choices, we also have the possibilities for artistic ingenuity and styles.
This series may have picked up just a few of the characteristics and attractions of pixel art. Other common topics around pixel art are also interesting from philosophical and aesthetic perspectives, such as the fact that pixel art was created with consideration for the “blurring” that occurs when it is displayed on a CRT via analog video transmission. There are many intriguing issues that have not yet been discussed in the philosophy of pixel art.
Translated by MATSUNAGA Shinji
*URL links were confirmed on February 13, 2023.