# Jukujikun: Kanji Compounds That Don't Read Like Their Parts

## Spelling, not arithmetic

Some kanji compounds aren't compounds at all — they're spelling.

[大人](/words/otona) reads *otona*, not *dai-jin* or *oo-hito*. [今日](/words/kyou) reads *kyō*, not *kon-nichi* (well, sometimes — see below). [紅葉](/words/momiji) reads *momiji*. [田舎](/words/inaka) reads *inaka*. The pronunciation is glued to the whole-word meaning, not assembled from the readings of the individual characters.

These are 熟字訓 ([jukujikun](https://en.wikipedia.org/wiki/Ateji#Jukujikun_and_ateji), "compound-word kun-readings"). They look like anomalies if you assume kanji-readings build words. They stop being anomalies the moment you remember [kanji](https://en.wikipedia.org/wiki/Kanji) were imported into a language that already had words. Jukujikun is the receipt.

### The opening canon

A short list every Japanese reader recognizes on sight. The phonology column is the load-bearing one — it explains what the spoken word actually is, independent of the characters glued to it.

| Surface | Kana | Romaji | Meaning | What's going on phonologically |
|---------|------|--------|---------|-------------------------------|
| [大人](/words/otona) | おとな | otona | adult | Native [wago](https://en.wikipedia.org/wiki/Wago) *otona* "grown one" predates the literate scribe; 大 + 人 chosen for sense, not sound |
| [今日](/words/kyou) | きょう | kyō | today | Yamato deictic *kepu* > *kyō*; 今 + 日 = "this day" by gloss |
| [昨日](/words/kinou) | きのう | kinō | yesterday | OJ *ki-no-pi* "the day before"; 昨 means "previous", 日 "day" |
| [明日](/words/asu) | あす | asu | tomorrow | OJ *asu* "morning, next day"; 明 "bright/dawn" + 日 "day" |
| [紅葉](/words/momiji) | もみじ | momiji | autumn leaves / maple | Verb *momidu* "to redden"; 紅 "crimson" + 葉 "leaf" gloss the meaning |
| [田舎](/words/inaka) | いなか | inaka | countryside | Compound of OJ *ina* "rice" + *ka* "place"; 田 "paddy" + 舎 "dwelling" gloss the sense |
| [海老](/words/ebi) | えび | ebi | shrimp / lobster | Native *ebi*; 海 "sea" + 老 "old (bent)" describe the body, not the sound |
| [太刀](/words/tachi) | たち | tachi | long sword | Native *tati* (cf. verb *tatsu* "to stand/strike"); 太 "thick" + 刀 "blade" |
| [大和](/words/yamato) | やまと | yamato | ancient Japan / "great harmony" | Place name *yamatə* (mountain-gate region); 大 + 和 added late as honorific spelling |
| [山車](/words/dashi) | だし | dashi | festival float | Verb *dasu* "to put out / pull out"; 山 "mountain" + 車 "cart" describe the float |
| [雪崩](/words/nadare) | なだれ | nadare | avalanche | Verb *nadaru* "to slide down"; 雪 "snow" + 崩 "collapse" gloss the event |
| [五月雨](/words/samidare) | さみだれ | samidare | early-summer rain | OJ *sa-midare* "rice-planting rain"; 五月 "fifth month" + 雨 "rain" gloss the season |
| [二十歳](/words/hatachi) | はたち | hatachi | twenty years old | Yamato numeral *hata-* "twenty" + age suffix *-chi*; 二十 + 歳 spell out the meaning |
| [七夕](/words/tanabata) | たなばた | tanabata | star festival (7 July) | OJ *tanabata* "loom-shelf maiden" (a deity); 七 + 夕 "seventh evening" date-gloss the festival |
| [為替](/words/kawase) | かわせ | kawase | exchange / remittance | Verb *kawasu* "to exchange"; 為 "do" + 替 "replace" gloss the action |

In every row, the kana column is the historical fact. The kanji are a logograph chosen *after* the word existed, picked to encode the meaning rather than transcribe the sound. *Otona* is not made of *o-* and *-tona* the way *gakkō* is made of *gaku* + *kō*. It is one indivisible Yamato word with a two-character spelling.

### The mechanism

[Japanese](https://en.wikipedia.org/wiki/Old_Japanese) had a spoken vocabulary for centuries before kanji arrived from the Korean peninsula in roughly the 5th century CE (Seeley 1991). When the scribal class began writing Japanese — not Chinese — they faced a forced choice for every native word:

1. **Pick a kanji whose Chinese sound resembled the Japanese word.** This is [ateji](https://en.wikipedia.org/wiki/Ateji) (当て字, "applied characters"). 寿司 → *sushi* is the classic case: 寿 reads ジュ/ス, 司 reads シ; together they ad-hoc-spell *su-shi*, with semantic flavor (寿 "longevity", 司 "to manage") as a bonus rather than the load-bearing layer.
2. **Pick kanji whose Chinese meaning matched the Japanese word, and ignore the sound entirely.** The reader sees the characters, recognizes the meaning, and supplies the native pronunciation from memory. This is jukujikun.
3. **Coin a new on-reading compound from Sino-Japanese morphemes.** This is the productive [kango](https://en.wikipedia.org/wiki/Sino-Japanese_vocabulary) mechanism (学校 *gakkō*, 電話 *denwa*).

Path 2 is the load-bearing claim of this post. The kanji function as a logograph for the whole word; the phonology is a wago survival, untouched by the borrowing. Habein (1984) traces this layered orthography back to the Nara-period *[Kojiki](https://en.wikipedia.org/wiki/Kojiki)* (712) and *[Man'yōshū](https://en.wikipedia.org/wiki/Man%27y%C5%8Dsh%C5%AB)* (c. 759), where the same scribes wrote the same word three different ways within a single text — sometimes phonetically, sometimes semantically, sometimes as jukujikun — depending on register and prosody.

### Jukujikun vs ateji vs kango

These three are routinely confused. The crisp distinction is which of {sound, meaning, both} the kanji choice is tracking.

| Type | Kanji track... | Reading source | Example | Decomposition |
|------|----------------|----------------|---------|---------------|
| **Phono-semantic compound (kango)** | sound + meaning compositionally | Sino-Japanese morphemes assembled | 学校 *gakkō* | 学 *gaku* "learn" + 校 *kō* "school" |
| **Native compound (wago)** | meaning compositionally | native morphemes assembled | 山道 *yamamichi* | 山 *yama* "mountain" + 道 *michi* "road" |
| **Ateji (phonetic)** | sound only | the loaned/native word, sound-spelled | 寿司 *sushi* | 寿 ≈ *su*, 司 ≈ *shi* |
| **Jukujikun** | meaning only, holistically | the native word, ignored phonologically | [大人](/words/otona) *otona* | 大 + 人 = "adult" as a unit; sound is wago survival |

Joyce (2002) frames this as a difference in which side of the orthography-phonology mapping the morpheme lives on. In kango, a single kanji is a morpheme — a sound-meaning pair. In jukujikun, the *compound* is the morpheme: indivisible on the phonology side, even though it occupies multiple character slots on the orthography side. Tomita (1989), surveying ateji in Edo-period vernacular literature, distinguishes "phonetic ateji" (sound-driven, like 寿司) from "semantic ateji" — and the latter is essentially modern jukujikun under a different label. The terminological boundary kept getting renegotiated into the early 20th century.

The practical test: if you can swap out one character and the reading of the remaining character stays put, it is a compositional compound. If swapping one character forces the whole reading to change because the unit was the gestalt — you were looking at jukujikun.

### Why so many are calendar and agricultural words

Run through any list of jukujikun and a pattern jumps out. They cluster on time, weather, agriculture, fauna, and place names.

| Domain | Examples | Why |
|--------|----------|-----|
| Calendar | [今日](/words/kyou), [昨日](/words/kinou), [明日](/words/asu), [七夕](/words/tanabata), [二十歳](/words/hatachi) | Time-deixis is core vocabulary; yamato words for "today/yesterday/tomorrow" predate writing by millennia |
| Weather | [五月雨](/words/samidare), [雪崩](/words/nadare), 時雨 *shigure*, 吹雪 *fubuki* | Climate vocabulary precedes literacy and resists replacement |
| Agriculture | [田舎](/words/inaka), [紅葉](/words/momiji), 早苗 *sanae*, 稲妻 *inazuma* | Rice-cycle vocabulary is the deepest stratum of Yamato Japanese |
| Fauna | [海老](/words/ebi), 海月 *kurage*, 河豚 *fugu*, 百足 *mukade* | Native folk taxonomy; kanji written as descriptive glosses |
| Place / kin | [大和](/words/yamato), 田舎, 海原 *unabara*, 父さん *tōsan* | Toponyms and kin terms are linguistic fossils |

Not coincidence. Frellesvig (2010, ch. 1) classifies these as the deep-stratum native lexicon — morphemes attested in Old Japanese (8th century) and reconstructable into [proto-Japonic](https://en.wikipedia.org/wiki/Proto-Japonic_language). They were already entrenched, prosodically irregular, and resistant to reanalysis when Chinese characters arrived. [Vovin](https://en.wikipedia.org/wiki/Alexander_Vovin) (2010), reconstructing Old Japanese from *Man'yōshū* phonograms, shows that words like *yamatə*, *ki-no-pi*, *sa-midare*, and *tanabata* were stable centuries before any scribe needed to write them down. When literacy came, the path of least resistance was to assign kanji as semantic glosses and leave the spoken word alone.

The strongest predictor of jukujikun status is not orthographic. It's lexical: *deep-stratum native vocabulary, especially in semantic domains that predate Chinese cultural influence*. Compounds for things the Chinese imports actually introduced — bureaucracy, Buddhism, scholarship, technology — went the kango route. Compounds for things the Yamato had already named for centuries went jukujikun.

### Fossils

Because the spoken layer is a wago survival untouched by Sino-Japanese phonology, jukujikun preserve archaic readings that have otherwise dropped out of the language. They are linguistic fossils. Historical phonology lives off them.

| Modern jukujikun | OJ / proto-form | Note |
|------------------|-----------------|------|
| [大和](/words/yamato) *yamato* | OJ *yamatə* (with reduced central vowel ə) | The final *-to* preserves an OJ ə that merged into *o* in most modern words; here it is locked in by the spelling |
| [田舎](/words/inaka) *inaka* | *ina* "rice (paddy)" + *ka* "place / locale" | *ina-* is the same morpheme behind *inazuma* "lightning" (lit. "rice-spouse") and *inaho* "ear of rice"; freestanding *ina* is otherwise extinct |
| [今日](/words/kyou) *kyō* | OJ *kepu* < proto *ke-pi* "this day" | The *-pu* > *-fu* > *-u* shift is regular; *kepu* is attested in the *Man'yōshū* |
| [明日](/words/asu) *asu* | OJ *asu* / *asita* "morning, next day" | The doublet *asu* (formal/poetic) vs *ashita* (everyday) preserves an OJ register split |
| [七夕](/words/tanabata) *tanabata* | *tana-bata* "shelf-loom" — the loom-maiden deity | The free morpheme *tana* "shelf" survives; *hata* "loom" survives; the compound noun is otherwise dead |
| [紅葉](/words/momiji) *momiji* | OJ verb *momidu* "to redden/crimson" → nominalized *momidi* | Conjugation class lost; the noun is the only surviving form of the verb root |

Vovin (2003), reconstructing Classical Japanese morphology, treats jukujikun as one of the most reliable sources of pre-Heian phonological evidence. The reasoning is mechanical: because the spelling is locked to the meaning rather than the pronunciation, the pronunciation is free to preserve forms that would otherwise have been overwritten by Sino-Japanese phonotactics. The orthography is conservative. The phonology is conservative. The result is a sealed time capsule.

### The Man'yōshū as ground zero

The 8th-century *[Man'yōshū](https://en.wikipedia.org/wiki/Man%27y%C5%8Dsh%C5%AB)* (万葉集, "Collection of Ten Thousand Leaves") is where the trichotomy first becomes legible. In the same poem — sometimes the same line — a scribe might write Japanese using kanji-as-phonogram (the *man'yōgana* that became kana), kanji-as-semantic-gloss, and kanji-as-jukujikun. Cranston (1993, *A Waka Anthology, Volume One: The Gem-Glistening Cup*) gives parallel transliterations on facing pages to make this layering visible.

![Page from the Genryaku-bon manuscript of the Man'yōshū, an 11th-century copy of an 8th-century Japanese poetry anthology](https://upload.wikimedia.org/wikipedia/commons/thumb/a/ae/Genryaku_Manyosyu.JPG/800px-Genryaku_Manyosyu.JPG)
*Genryaku-bon (元暦校本) manuscript of the* Man'yōshū, *an 11th-century copy collated in 1184 of the 8th-century original. The Nara-period scribes who compiled it wrote Japanese with no native script — they shuttled freely between phonographic kanji (man'yōgana), semantic kanji, and jukujikun within a single poem. This is where the three orthographic strategies were negotiated in real time. Source: [Wikimedia Commons](https://commons.wikimedia.org/wiki/File:Genryaku_Manyosyu.JPG).*

The deeper background — how kanji even arrived on the archipelago, and why the scribal class needed three competing strategies at all — is in our [history of kanji from oracle bones to the three scripts](/blog/history-of-kanji-oracle-bones-to-three-scripts). For this post the relevant fact is that jukujikun is not a 20th-century pedagogical curiosity. It is a 1,300-year-old orthographic practice, present at the moment Japanese became a written language.

### The reform threats, and why jukujikun survived

Postwar language-policy reformers, working under the *tōyō kanji* (当用漢字, 1946) and later *[jōyō kanji](https://en.wikipedia.org/wiki/J%C5%8Dy%C5%8D_kanji)* (常用漢字, 1981, revised 2010) lists, took a fundamentally compositional view: each kanji should have a small fixed set of readings, and compounds should be assembleable from those. Jukujikun violates this principle by definition — the compound's reading is not derivable from the parts. Several reformers in the 1946–1956 period proposed retiring jukujikun outright, replacing them with all-kana spellings (おとな for 大人, きのう for 昨日) or with phonetic ateji (Twine 1991, ch. 5).

Three reasons they survived:

| Reason | Mechanism |
|--------|-----------|
| **Frequency** | Words like 大人, 今日, 明日, 田舎 are in the top few thousand of any frequency list. Re-spelling the most common words in the language was a non-starter. |
| **Pedagogy** | Elementary curricula could absorb a small list of "special readings" (特別な読み, *tokubetsu na yomi*) as exceptions, taught by rote rather than rule — the same way English teaches "knight", "though", "colonel". |
| **Furigana** | Writers could (and do) annotate jukujikun with [kana ruby](https://en.wikipedia.org/wiki/Furigana) above the kanji whenever ambiguity threatens. The cost of the exception is amortized across the writing system rather than borne by the lexicon. |

Gottlieb (1995, *Kanji Politics*) argues that the survival of jukujikun is one of the clearest signals that postwar Japanese script reform was incremental, not revolutionary: every proposal that demanded users *unlearn* high-frequency forms failed, while proposals that simply pruned low-frequency kanji largely succeeded. The pedagogical handling — which kanji get taught when, and how exceptions are introduced — is laid out in our post on the [kyōiku kanji and how Japanese children learn them](/blog/kyoiku-kanji-how-japanese-children-learn).

### The 247 official jukujikun on the Joyo list

The 2010 revision of the *[jōyō kanji-hyō](https://en.wikipedia.org/wiki/J%C5%8Dy%C5%8D_kanji)* (常用漢字表) does not just list 2,136 characters with their officially-sanctioned on/kun readings. Appendix 2 (付表 *fuhyō*) lists exactly **247 jukujikun** and special-reading compounds explicitly recognized as standard Japanese, taught in school, and acceptable in government writing. They get their own appendix because they cannot be derived from the main reading tables. The appendix is the official acknowledgment that compositionality breaks here.

A representative sample:

| Surface | Kana | Romaji | Meaning | Joyo status |
|---------|------|--------|---------|-------------|
| 明日 | あす | asu | tomorrow | Appendix 2 |
| 大人 | おとな | otona | adult | Appendix 2 |
| 今日 | きょう | kyō | today | Appendix 2 |
| 昨日 | きのう | kinō | yesterday | Appendix 2 |
| 田舎 | いなか | inaka | countryside | Appendix 2 |
| 紅葉 | もみじ | momiji | maple, autumn leaves | Appendix 2 |
| 七夕 | たなばた | tanabata | star festival | Appendix 2 |
| 二十歳 | はたち | hatachi | 20 years old | Appendix 2 |
| 五月雨 | さみだれ | samidare | early-summer rain | Appendix 2 |
| 雪崩 | なだれ | nadare | avalanche | Appendix 2 |

The full list is published by the Agency for Cultural Affairs at the [Bunkachō 常用漢字表 page](https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kakuki/14/tosin02/index.html). Words not in Appendix 2 are not "wrong" — 海老 *ebi*, 太刀 *tachi*, 山車 *dashi*, 為替 *kawase* are entirely standard. They are just not on the list of jukujikun the government commits to teaching every student. Appendix 2 is the floor, not the ceiling.

### Edge cases: one surface, multiple readings

Most jukujikun coexist with a kango reading of the same characters. The choice between them encodes register, formality, or domain.

| Surface | Reading 1 | Reading 2 | Reading 3 | What flips it |
|---------|-----------|-----------|-----------|---------------|
| 今日 | [きょう (kyō, jukujikun)](/words/kyou) — everyday | こんにち (konnichi, kango) — formal "the present day" | — | Register: news/letters use konnichi; speech uses kyō |
| 明日 | [あす (asu, jukujikun)](/words/asu) — poetic / formal | あした (ashita, wago) — everyday | みょうにち (myōnichi, kango) — very formal | Register and prosody (asu in waka, ashita in conversation) |
| 一日 | ついたち (tsuitachi, jukujikun) — "1st of the month" | いちにち (ichinichi, kango) — "one day (duration)" | いちじつ (ichijitsu, kango) — archaic literary | Calendrical vs durational meaning |
| 大人 | [おとな (otona, jukujikun)](/words/otona) — "adult" | だいにん / たいじん (dainin / taijin, kango) — "important person" | — | Sense: ordinary "adult" vs honorific "great person" |
| 紅葉 | [もみじ (momiji, jukujikun)](/words/momiji) — "maple, autumn leaves" | こうよう (kōyō, kango) — "the leaves turning color" (verb-like) | — | Object (the tree/leaf) vs event (the seasonal change) |
| 山車 | [だし (dashi, jukujikun)](/words/dashi) — "festival float" | さんしゃ (sansha, kango) — not used for floats | — | Domain: festival vocabulary locks the jukujikun reading |

Backhouse (1993) frames this as register stratification. The kango reading is the academic / formal / Sinitic option; the jukujikun is the native / everyday / lived-in option. Same surface, two different sociolinguistic registers, distinguished only by which reading the speaker activates. One of the cleanest demonstrations in any writing system that orthography and phonology are separable layers.

### How this dictionary marks jukujikun

Every word here has a `word_formation_type` (kango / wago / jubako / yuto / jukujikun / mixed), and every per-character link in a word carries a `reading_type` (onyomi / kunyomi / nanori / **jukujikun**). For jukujikun words the per-character `pronunciation` field is intentionally **nil** — the kanji does not contribute a phonetic value. Only the whole-word reading is meaningful. The data model is the same shape as the linguistic claim.

You can see this on the show pages directly:

- [/words/otona](/words/otona) — `word_formation_type: jukujikun`. The kanji [大](/kanjis/5927) and [人](/kanjis/4eba) link in with `reading_type: jukujikun` and `pronunciation: nil`.
- [/words/momiji](/words/momiji) — same pattern with [紅](/kanjis/7d05) and [葉](/kanjis/8449).
- [/words/inaka](/words/inaka) — [田](/kanjis/7530) and [舎](/kanjis/820e) marked jukujikun.
- [/words/tanabata](/words/tanabata) — [七](/kanjis/4e03) and [夕](/kanjis/5915), reading bound to the whole. ([Tanabata](https://en.wikipedia.org/wiki/Tanabata) is the loom-maiden festival on 7 July.)
- [/words/kyou](/words/kyou) — [今](/kanjis/4eca) and [日](/kanjis/65e5), with the kango doublet *konnichi* listed for contrast.
- [/words/asu](/words/asu) — [明](/kanjis/660e) and [日](/kanjis/65e5), with *ashita* and *myōnichi* both linked.

On a kanji's own show page, words are grouped by reading type. The kunyomi block shows kanji that *contribute* their kun reading to a compound; the jukujikun block shows compounds where the kanji is just spelling. The visual distinction is intentional. Phonological arithmetic and orthographic logogram are different kinds of facts.

Word-formation typology more broadly — kango, wago, the two mixed types (jubako and yuto), and jukujikun — has its own dedicated post: [word formation types: kango, wago, and jukujikun](/blog/word-formation-types-kango-wago-jukujikun).

Jukujikun is what happens when a writing system is glued onto a language that already had centuries of words. Most of the time the glue is invisible — on-readings flow compositionally, kun-readings flow compositionally, and the system pretends to be modular. Jukujikun is where the pretense drops. The compound is one Yamato word with a two-character spelling, the kanji are along for the ride, and the only thing the reader can do is recognize the gestalt. For a system that gets accused of being unprincipled, this is one of the few places it actually is.

### Further reading

Internal:

- [On'yomi vs Kun'yomi: a practical guide](/blog/onyomi-vs-kunyomi-practical-guide) — the four historical strata of Sino-Japanese readings that jukujikun routes around.
- [History of kanji: oracle bones to the three scripts](/blog/history-of-kanji-oracle-bones-to-three-scripts) — the *Man'yōshū* and the moment Japan got a writing system.
- [Kyōiku kanji: how Japanese children learn](/blog/kyoiku-kanji-how-japanese-children-learn) — how the school system handles jukujikun as exceptions.

External:

- Seeley, C. (1991). *A History of Writing in Japan*. University of Hawai'i Press. The standard English-language history; chs. 2–4 trace the layered orthography from Nara to Heian.
- Frellesvig, B. (2010). *A History of the Japanese Language*. Cambridge University Press. Ch. 1 on the deep-stratum native lexicon; ch. 6 on Sino-Japanese borrowing.
- Vovin, A. (2003). *A Reference Grammar of Classical Japanese Prose*. RoutledgeCurzon. Treats jukujikun as a primary source of pre-Heian phonological evidence.
- Habein, Y. S. (1984). *The History of the Japanese Written Language*. University of Tokyo Press.
- Twine, N. (1991). *Language and the Modern State: The Reform of Written Japanese*. Routledge. Chs. 4–5 on the postwar reform debates over jukujikun and ateji.
- Bunkachō (2010). [常用漢字表 (joyo kanji table)](https://www.bunka.go.jp/kokugo_nihongo/sisaku/joho/joho/kakuki/14/tosin02/index.html). Appendix 2 is the official jukujikun list.

