Doc Type: Partially documented proposal
Title: Proposal to include Chinook Pipa script in UCS
Source: Van Anderson
Status: For public review.
Replaces: Proposal v. 4.0
Action: For review by community. Comments to Van Anderson
Date: 2009-04-26
Discussion list: Chinook in the UCS


Overview

The Chinook Pipa script, an adaptation of the Duployéan shorthand by Father Jean Marie Raphael LeJeune, is an historic script used for writing the Chinook Jargon and other languages of interior British Columbia. Its original use and greatest surviving attestation is the run of the Kamloops Wawa, a (mostly) Chinook Jargon newsletter of the Catholic diocese of Kamloops, British Columbia, published 1891-1923. At the time, the Chinook Jargon trade language was spoken in an area encompassing SE Alaska, most of British Columbia, Washington State, western Montana, Oregon, Idaho, and far northern California. Although the Chinook Jargon was the lingua franca in many communities, it was generally a spoken rather than a written language. Most attempts at documentation used the Latin script to approximate Jargon phonology, and dictionaries of the Chinook Jargon are still readily available in these Latinate orthographies. In contrast, the archives of the Kamloops Wawa, written in Chinook Pipa, not only include a considerable dictionary but also constitute a corpus of more than three decades of Chinook Jargon usage from the height of its spread and utility. There currently exists no formal encoding, in any context, for the representation of the Chinook Pipa, and the only informal representation is transliteration by means of the Latin orthographies used in writing the Chinook Jargon. Indeed, the submission of the Chinook Pipa script to UCS has necessitated the creation, from scratch, of the first Chinook typeface; that effort is currently underway, with glyph images available for review.

Structure

Chinook Pipa contains several classes of letters, differentiated by visual form - hence script function - and phonetic value. Letter classes include the line and arc consonants, circle vowels (A and the O/W vowels), nasal vowels, arc vowels, H and X. Vowels are further classified by their compounding behaviour. Since Chinook Pipa is an adaptation of a shorthand system, strings of letters are intended to join together cursively to form nominally syllabic units. This syllabic joining is generally algorithmic, but alternate syllable formation is quite commonly inherited from source languages and requires manual encoding or a form of spell checking. Most Pipa letters have variant forms, including the addition of ancillary dots, compounding of vowels, and overlapping concatenated behaviours for initialisms and abbreviations. Except for the reverse stroke direction of some letters, Chinook Pipa is written LTR, syllable by syllable, in horizontal lines proceeding down the page, as with most European scripts.

Ordering

Ordering of the characters in the Chinook Pipa is undefined - the only lexicon using the script cites nominally in Latin alphabetical order - so allocation order in the Chinook Pipa Character Block is revisable up to inclusion in the standard. Essentially, a Unicode Standard that includes a Chinook Pipa Character Block will be the only official ordering of the script. The currently proposed allocation ordering and its basis are as follows. According to Father LeJeune's Chinook Rudiments, the characters encoded x00-x09 double as the numbers 1-9 and 0. x09 & x0A constitute the next basic vowels given in his introduction. x0B is another simple vowel with a related variant form 16 code points later. x0C & x0D round out the basic vowels given in LeJeune's repertoire, while x0E is the last simple (non-compound) vowel in the Chinook Pipa. x0F is a combining mark, used to indicate a Salish letter (by modifying U) and the modified arc consonants Ng, Ch/J, and Ts, and a variant, possibly glottalized, consonant. The second column of the allocation begins (x10-x14) with the voiced counterparts (elongated form) of the first five consonants (x00-x04). x15 is the last simple consonant and x16 is the similarly acting and phonologically related letter 'X' used in writing Salishan languages. x17-x1A are reserved, ostensibly, for any new Salish-specific letters discovered in the corpus of handwritten texts left unstudied. x1B is the sister character to x0B. These two characters often have different orientations, being just turned or mirrored versions of each other, and distinct conjoining properties, but are identical in certain environments. x1C and x1D are again reserved code points. The second column is rounded out with the Chinook Pipa Full Stop (x1E) and the Virama-like Chinook Pipa Concatenator (x1F), which encodes for abbreviations and similar constructions in the script.

In the last column come the Nasal Vowels, x20-x23, which appear only intermittently in the Wawa texts but are neither composed characters nor variants. Last of all is the logograph /likalisti/, meaning eucharist, at x24. All further code points x25-x2F are again reserved, this time for any other logographs encountered in Pipa texts.

Alphabetization

No information is available on alphabetization, as the dictionary portions of the Chinook Rudiments text are given in roughly Latin alphabetical order. Other sources group words by novel alphabetization, no more or less canonical than any other. The most logical ordering, given the structure of the script, would be along the lines of P, B, T, (Th) D, (Dh) F, V, K, (Kh) G, L, (Lh, hL) R, (Rh) M, N, (Ng) Sh, (Ch), S, (Ts), O, (W+O+ vowels), (O+ vowels) A, (Wa), I, E, (Wi), (Wi+ vowels) Oo, (Woo), Ow, (Wow) (Ow+ vowels), U, H, (X) then An, In, On, and Un. Given that alphabetization is not a defined property of the Unicode Standard, it would seem that the above or simple binary order would more than suffice for any implementation needing an order of alphabetization.


Chinook Pipa Codechart v.4.1


Glyphs (columns U+x00, U+x10, U+x20; rows 0-F; glyph images not reproduced)

Character List
Short Line Consonants
00  CHINOOK PIPA LETTER P
    · number 1
01  CHINOOK PIPA LETTER T
    · number 2
02  CHINOOK PIPA LETTER F
    · number 3
03  CHINOOK PIPA LETTER K
    · number 4
    • written down and to the left
04  CHINOOK PIPA LETTER L
    · number 5
    • written up and to the right

Arc Consonants
05  CHINOOK PIPA LETTER M
    · number 6
06  CHINOOK PIPA LETTER N
    · number 7
07  CHINOOK PIPA LETTER SH
    · number 8
08  CHINOOK PIPA LETTER S
    · number 9

Simple Vowels
09  CHINOOK PIPA LETTER O
    · number 0
    · Compound Base Vowel
    · Compounding Vowel
    · Circle vowel
0A  CHINOOK PIPA LETTER A
    · Compounding Vowel
    · Circle vowel
0B  CHINOOK PIPA LETTER I
    · non-Compounding vowel
    → 1B chinook pipa letter E
0C  CHINOOK PIPA LETTER OO
    · Compounding Vowel
    · Circle vowel
0D  CHINOOK PIPA LETTER OW
    · Compound Base Vowel
    · Compounding Vowel
    · Circle vowel
0E  CHINOOK PIPA LETTER U
    · non-Compounding vowel

Modifying Dot
0F  COMBINING CHINOOK PIPA MODIFYING MARK
    · modifies N → Ng, Sh → Ch, S → Ts, K → K'
    · abbreviated CCMM
    · shape shown is not representative of all visual forms

Long Line Consonants
10  CHINOOK PIPA LETTER B
11  CHINOOK PIPA LETTER D
12  CHINOOK PIPA LETTER V
13  CHINOOK PIPA LETTER G
    • written down and to the left
14  CHINOOK PIPA LETTER R
    • written up and to the right

Dot Consonant
15  CHINOOK PIPA LETTER H
    → 00B7 middle dot

Additions for Salish
16  CHINOOK PIPA LETTER X
    · Voiceless velar/uvular fricative
17  <reserved>
18  <reserved>
19  <reserved>
1A  <reserved>

Additional Vowels
1B  CHINOOK PIPA LETTER E
    · Compounding Vowel
    → 0B chinook pipa letter I
1C  <reserved>
1D  <reserved>

Punctuation
1E  CHINOOK PIPA PUNCTUATION FULL STOP

Concatenating Control Character
1F  CHINOOK PIPA CONCATENATOR
    · signifies abbreviations and initialisms
    · shape shown is arbitrary and is not visibly rendered
    → 10A3F Kharoshthi Virama

Nasal Vowels
20  CHINOOK PIPA LETTER AN
21  CHINOOK PIPA LETTER IN
22  CHINOOK PIPA LETTER ON
23  CHINOOK PIPA LETTER UN

Logographs
24  CHINOOK PIPA LIKALISTI SIGN
25  <reserved>
26  <reserved>
27  <reserved>
28  <reserved>
29  <reserved>
2A  <reserved>
2B  <reserved>
2C  <reserved>
2D  <reserved>
2E  <reserved>
2F  <reserved>


Zero Width Non-Joiner and Zero Width Joiner

The Chinook Pipa script has complex cursive conjoining, overlapped concatenating, and nested compounding behaviours, all of which are controlled through the Zero Width Non-Joiner and Zero Width Joiner. The Zero Width Joiner (ZWJ) encodes conjoining behaviour (nominally syllabification) that would otherwise not exist in a given context (see Syllable Formation below). The Zero Width Non-Joiner (ZWNJ) encodes the override of conjoining, concatenating, and compounding behaviour that would normally exist in a given context. Except in the case of compounding vowels, a sequence of multiple ZWNJ/ZWJ characters between two Chinook Pipa characters acts as a single instance of the final character in the sequence. Any ZWNJ/ZWJ characters between a Chinook Pipa character and a character in any other block will have no effect on the Chinook Pipa character, and will affect the other character canonically.
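
The rule that a run of joiners acts as its final member can be sketched in a few lines of Python. This is only an illustration under stated assumptions: since no code points have been assigned to the block, letters are written as transliterated tokens ("T", "Sh", "ZWJ", "ZWNJ"), and the function name collapse_joiner_runs is hypothetical. The exception for compounding vowels is deliberately not modelled.

```python
JOINERS = {"ZWJ", "ZWNJ"}

def collapse_joiner_runs(tokens):
    """Replace each run of ZWJ/ZWNJ tokens with the last joiner in the run.

    `tokens` is a list of transliterated names (an assumption of this sketch,
    not an encoding defined by the proposal).
    """
    out = []
    for tok in tokens:
        if tok in JOINERS and out and out[-1] in JOINERS:
            out[-1] = tok  # the later joiner overrides the earlier one
        else:
            out.append(tok)
    return out
```

For example, collapse_joiner_runs(["T", "ZWJ", "ZWNJ", "K"]) gives ["T", "ZWNJ", "K"], so the pair is treated as a plain syllable break.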


Cursive Conjoining

The most common form of character interaction is that of the cursive connection. The termination of the stroke of an initial character leads directly into the beginning of the next character. Circle vowels are connected on their perimeter at a tangent. The vowels "I", "E", and "U" rotate to connect without angle to a preceding character and with minimal angle into following characters. The letter "Oo" connects with preceding characters as normal, but most often cursively connects with a following character at the nub. Rules for which characters can connect are given below at Syllable Formation.


Vowel Compounding

Compound vowels, with the exceptions of Wi and Wa, are exceedingly rare in Chinook Jargon works, and of only limited use in Chinook Pipa texts in other languages. Compound vowels (denoted Vv+) take the form of a Compound Base (denoted Vb), either "O" or "Ow", with the addition of Compounding vowels (denoted Vc) nested inside the base. Note that the Compound Base vowels are also Compounding vowels, and can nest inside each other. One of the Compounding non-Base vowels ("E") can cursively join inside the Compound Base, while the others ("Oo" and "A") will complete a compound vowel unless overridden by ZWJ. Theoretically, there is no limit to the depth of compound nesting, but no examples have been found of more than two Compound Base vowels or two Compounding vowels in a single compound vowel. A non-Compounding vowel (denoted V0) always marks the end of a compound vowel, and joins that compound vowel as normal, i.e. "I" will cursively connect on the right, and "U" will form a new syllable. Vowel compounding can be overridden with the Zero Width Non-Joiner in the same way as normal syllabic breaking (see example 14 in Figure 1-1). A compounding vowel can cursively conjoin with a Compound vowel or Compound base using the sequence V(+/b) + ZWNJ (breaking the compounding) + ZWJ (signaling conjoining) + Vc. The sequence V+ + ZWJ + ZWNJ + Vc, like other instances of ZWJ + ZWNJ, is equivalent to V+ + ZWNJ + Vc.

Figure 1-1. Compound Vowel formation in Chinook Pipa (glyph images not reproduced)
(1) Ob + Ac → Wa+
(2) Ob + Ec → Wi+
(3) Ob + Oc → Wo+
(4) Ob + Ooc → Woo+
(5) Ob + Owc → Wow+
(6) Ob + Obc + Ac → Ohwa+
(7) Owb + Ac → Owah+
(8) Ob + Ec + Ec → Weyi+
(9) Ob + Ec + Ac → Weeya+
(10) Ob + Oc + I0 → Wo+i *
(11) Ob + Obc + Ec → Ohwi+ *
(12) Ob + Obc + Ec + Ac + I0 → Ohwia+i *
(13) Ob + Ec + ZWNJ + ZWJ + Ec → Wi+eh *
(14) Ob + ZWNJ + E0 → O.E
(15) Ob + Ac + ZWJ + Ec → Wai *
* These Compound Vowels are not known in the source texts, but are included for demonstration purposes.
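
The compounding rules can be read as a small grouping algorithm. The following Python sketch is one possible reading, not a definitive implementation: letters are transliterated tokens, the names BASES, COMPOUNDING, CLOSING and compound_groups are hypothetical, and the ZWNJ + ZWJ conjoining sequence is handled only to the extent that ZWNJ ends the compound.

```python
BASES = {"O", "Ow"}                        # Compound Base vowels (Vb)
COMPOUNDING = {"O", "Ow", "A", "Oo", "E"}  # Compounding vowels (Vc)
CLOSING = {"A", "Oo"}                      # complete a compound unless followed by ZWJ

def compound_groups(tokens):
    """Group tokens so that each compound vowel becomes one list.

    A ZWJ that re-opens a completed compound is consumed and does not
    appear in the output group (a simplification of this sketch).
    """
    groups, i, n = [], 0, len(tokens)
    while i < n:
        if tokens[i] not in BASES:
            groups.append([tokens[i]])
            i += 1
            continue
        group, closed = [tokens[i]], False
        i += 1
        while i < n:
            nxt = tokens[i]
            # ZWJ after "A"/"Oo" lets a further compounding vowel nest.
            if nxt == "ZWJ" and closed and i + 1 < n and tokens[i + 1] in COMPOUNDING:
                closed = False
                i += 1
                continue
            # ZWNJ, consonants and non-compounding vowels all end the compound.
            if closed or nxt not in COMPOUNDING:
                break
            group.append(nxt)
            closed = nxt in CLOSING
            i += 1
        groups.append(group)
    return groups
```

Under these assumptions, example (10) O + O + I groups as [["O", "O"], ["I"]], and example (15) O + A + ZWJ + E groups as [["O", "A", "E"]].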

It has been pointed out that the rules for vowel compounding present a complexity that could be alleviated by including precomposed characters for all compound vowels. While I am receptive to the concept, I believe that three factors make this an unsatisfactory model for encoding of Chinook Pipa. 1) The repertoire of compound vowels listed above, while considerable, may not represent the entirety of compound vowels found in Chinook Pipa texts, and the process of adding new characters to the block would present unwarranted constraints on the ability of scholars to exchange script usage data in the future. 2) The number of compound vowels would require expanding the current allocation space for the Chinook Pipa script, as well as increasing the estimated demand on allocation room in the future. 3) The current repertoire respects both an analysis of the compound vowels as segmental, and the Unicode Design Principle of encoding plain text.

Given that the best typefaces designed for Chinook Pipa will probably contain precomposed glyphs for most compound vowels, novel uses may have unknown representations. However, I believe that fallback rendering would probably better serve the needs of the mostly scholarly community that will make use of this script. The issue should not arise for members of the general public, who will more than likely use the script for the representation of either English or colloquial Chinook Jargon, both of which are completely represented by the compound vowel forms included above.



Chinook Pipa Concatenator

Normally, Chinook Pipa letters conjoin or compound cursively or separate into syllables by algorithm. There is, however, a variant joining behaviour, in which adjacent line and arc consonants will overlap, signifying an abbreviation, initialism, or acronym (denoted CCx). The Chinook Pipa Concatenator (CPC, U+x1F) signifies this alternate concatenating behaviour, much like the Virama in Indic scripts indicating conjunct letters. The CPC is placed between the affected consonants, signifying the concatenating interaction of the two letters. A concatenated consonant cluster will conjoin cursively as normal with any preceding characters, and will break immediately afterwards unless overridden with the Zero Width Joiner. The Chinook Pipa Concatenator cedes to both ZWNJ and ZWJ, allowing non-standard abbreviations that may be composed of cursively conjoined or separated characters, but retaining the cursive joining properties of a concatenated sequence. Therefore, two Chinook Pipa letters interrupted by any combination of the CPC and either the Zero Width Joiner or Zero Width Non-Joiner will join together as if the CPC were not there, and join adjacent letters as if the ZWJ/ZWNJ were not there.

Figure 1-2. Concatenated Consonant formation in Chinook Pipa (glyph images not reproduced)
(1) S + CPC + T → STx
(2) Sh + CPC + K → JKx
(3) S + CPC + B + CPC + Sh → SBShx
(4) I + T + CPC + S → I.TSx
(5) Sh + CPC + ZWJ + K → J-K
(6) Sh + CPC + ZWNJ + K → J.K
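
How the CPC cedes to the joiners can be sketched as a small classifier over the control tokens found between two consonants. This is an illustrative Python sketch only: the token names, the function inner_join, and its four result labels are hypothetical, and it assumes that when several joiners are present the last one decides.

```python
def inner_join(between):
    """Classify how two consonants interact, given the control tokens between them.

    Returns "conjoin" (cursive), "break" (separate), "overlap" (concatenated
    abbreviation) or "default" (ordinary algorithmic joining).
    """
    joiners = [t for t in between if t in ("ZWJ", "ZWNJ")]
    if joiners:
        # The CPC cedes to a joiner: the letters join as if the CPC were absent.
        return "conjoin" if joiners[-1] == "ZWJ" else "break"
    if "CPC" in between:
        return "overlap"
    return "default"
```

With this reading, Sh + CPC + K overlaps (example 2), Sh + CPC + ZWJ + K conjoins (example 5), and Sh + CPC + ZWNJ + K breaks (example 6).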




The Combining Chinook Pipa Modifying Mark

The Combining Chinook Pipa Modifying Mark (CCMM, U+x0F) is used to denote variant letterforms of various Chinook Pipa letters. Even though the CCMM appears similar to an H or a general combining dot, it behaves distinctly from either of these and has a more general appearance, usually a dot, but also a "tic" or crossbar through a line consonant. New letterforms created with the CCMM often have a modifying mark away from the base letter stroke, unlike an H-digraph letter, and the CCMM is rendered above, below, to the left, or to the right of the base character, depending on the orientation of the base letter - which can change contextually - unlike combining diacritics. Currently, the CCMM is attested in conjunction with the letters N, Sh, S, U, and K to represent Ng, Ch/J, Ts/Z, Uh or the labialized uvular Xw, and probably the glottalized velar K'. All instances of the CCMM in combination with the Zero Width Non-Joiner or Zero Width Joiner should act on the modified character exactly as they would following the base character without the CCMM. In other words, the CCMM should have no effect on the conjoining properties of its base character, and should be treated like any other combining diacritic mark. Proper Chinook Pipa typefaces would ideally have precomposed glyphs for most CCMM modified characters, as the mark has a distinct appearance in combination with different characters.
Figure 1-3. Letters with Combining Chinook Pipa Modifying Mark (glyph images not reproduced)
(1) S + CCMM → Ts
(2) Sh + CCMM → Ch/J
(3) N + CCMM → Ng
(4) U + CCMM → Ŭ or /xw/
(5) K + CCMM → K'
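
The statement that the CCMM changes a letter's reading but not its joining class can be illustrated with a short Python sketch. The token names, the table CCMM_READINGS (taken from Figure 1-3) and the function fold_ccmm are hypothetical; the "+CCMM" fallback label for unattested combinations is an assumption of this sketch.

```python
# Attested readings from Figure 1-3.
CCMM_READINGS = {"S": "Ts", "Sh": "Ch/J", "N": "Ng", "U": "Ŭ or /xw/", "K": "K'"}

def fold_ccmm(tokens):
    """Return (base, reading) pairs; a CCMM alters only the reading of its base."""
    result = []
    for tok in tokens:
        if tok == "CCMM" and result:
            base, _ = result[-1]
            # Unattested combinations keep the base name plus an explicit marker.
            result[-1] = (base, CCMM_READINGS.get(base, base + "+CCMM"))
        else:
            result.append((tok, tok))
    return result
```

Joining logic would then look only at the first element of each pair, so N + CCMM + ZWNJ + T is analysed exactly like N + ZWNJ + T.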

It has been pointed out that a simpler alternate to the CCMM would be to include precomposed characters for all modified letters. I believe that three factors make this proposal unsatisfactory. 1) the repertoire of modified letters above, while considerable, may not represent all possible forms used by Chinook Pipa writers. It is conceivable, at the very least, that undocumented texts in "minority" languages could use the modifier mark on most of the line consonants in the inventory for the representation of glottalized forms, as with K, or even double marked arc consonants. The burden of adding any new character to the block would present considerable constraint on the ability of scholars to exchange and document this novel script usage. 2) The number of possible letterforms significantly increases the estimated demand on allocation room in the future, and would increase the probability of the current allocation needing expansion. 3) The current repertoire conforms with an analysis of the modified letters as behaviourally unified with their constituent base characters, the CCMM as fundamentally diacritic in nature, and respects UTC practice of not encoding decomposable characters except for compatibility with pre-existing standards.

Given that the best typefaces designed for Chinook Pipa will probably contain precomposed glyphs for most CCMM combinations, novel uses will have unknown representations. However, I believe that fallback rendering would probably better serve the needs of the mostly scholarly community that will make use of this script. The issue should not arise for members of the general public, who will more than likely represent either English or colloquial Chinook Jargon with this script, both languages being completely represented by the glyphs included above.



H-digraphs and combining behaviours of the letter H

The letter H (U+x15) is normally found in isolation in Chinook Pipa texts; that is, it is rendered syllabically spaced from any other characters, without cursive connection. This occurs almost without exception when H is preceded by a vowel or non-line consonant. However, when preceded by a line consonant - and, in one known case, when preceding one - the letter H can also form a digraph to create an H-flavored variant of the consonant (denoted Ch or hC). These digraphs include Th, Dh, Kh, Lh, hL, and Rhh. These digraphs connect to surrounding letters as if the H were not present. The Zero Width Non-Joiner (ZWNJ, U+200C) will override Ch digraph creation as C + ZWNJ + h, or override combining behaviour with C + h + ZWNJ. The Zero Width Joiner (ZWJ, U+200D), by extension, encodes the rare hL digraph (or others) with h + ZWJ + C. In the event of a sequence of Ch + C needing rendering as cursively conjoined, the sequence C + ZWJ + h + C should be used, to disambiguate from the C + hC sequence C + h + ZWJ + C.

Figure 1-4. H digraphs and ZWNJ (glyph images not reproduced)
(1) T + H → Th
(2) D + H + I → DhI
(3) K + H → Kh
(4) L + H → Lh
(5) H + ZWJ + L → hL
(6) R + H + H → Rhh
(7) L + ZWNJ + H → L.H
(8) P + ZWJ + H → Ph*
(9) I + H + T → I.H.T
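
The default digraph behaviour of H can be sketched as a grouping pass over transliterated tokens. This Python sketch is an interpretation rather than a specification: LINE_CONSONANTS, group_h_digraphs and the helper _leads_digraph are hypothetical names, repeated H after a line consonant is generalised from the single attested Rhh, hC digraphs are assumed to be limited to line consonants, and the C + ZWJ + h form of example (8) is not covered.

```python
LINE_CONSONANTS = {"P", "T", "F", "K", "L", "B", "D", "V", "G", "R"}

def _leads_digraph(tokens, i):
    """True if tokens[i] is an H that starts an hC digraph (H + ZWJ + C)."""
    return (tokens[i] == "H" and i + 2 < len(tokens)
            and tokens[i + 1] == "ZWJ" and tokens[i + 2] in LINE_CONSONANTS)

def group_h_digraphs(tokens):
    """Merge H with a neighbouring line consonant where a digraph is formed."""
    units, i, n = [], 0, len(tokens)
    while i < n:
        tok = tokens[i]
        if _leads_digraph(tokens, i):
            units.append("h" + tokens[i + 2])   # rare hC form, e.g. hL
            i += 3
        elif tok in LINE_CONSONANTS:
            unit = tok
            i += 1
            # Absorb following H unless that H belongs to the next consonant.
            while i < n and tokens[i] == "H" and not _leads_digraph(tokens, i):
                unit += "h"
                i += 1
            units.append(unit)
        else:
            units.append(tok)                   # vowels, ZWNJ, isolated H, etc.
            i += 1
    return units
```

Under these assumptions T + H + ZWJ + L yields ["T", "hL"], matching the C + hC reading described above, while L + ZWNJ + H stays as three separate units.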

It has been noted that a simpler alternate to H-combining behaviour would be to include precomposed characters for all H-digraphs. I believe that three factors make this an unworkable concept. 1) The repertoire of H-digraphs above, while considerable, may not represent all forms in the corpus of Chinook Pipa texts. It is conceivable that undocumented texts could have several currently unknown H-digraph glyphs. The burden of adding newly discovered characters to the block would present constraints on the ability of scholars to document script usage in the minority languages these digraphs could be found in. 2) The number of possible letterforms increases the estimated demand on allocation room in the future, adding significant ambiguity to the space allocated for future additions to the script. 3) The current repertoire respects both a script analysis of H-digraphs as segmental and the Unicode Design Principle of encoding plain text.

One alternate that does not suffer from the drawbacks of precomposed characters is the idea of consistently encoding all H-digraph forms with the Zero Width Joiner (as in example 8 above) and making the default conjoining behaviour of H consistently non-joining. While simplifying the work of typographers, this proposal would increase the demand on end users, as the H-digraphs Th, Dh, Kh, Lh, hL, and Rhh are more common than the consonant clusters T.H, D.H, K.H, L.H, H.L, and R.H.H.

Furthermore, the use of ZWJ/ZWNJ with H-digraphs maintains internal consistency with the script behaviour concerning syllable breaking expanded on below; i.e. a ZWJ encodes co-syllabic behaviour (in this case, the digraph form) where it would otherwise not occur, and the ZWNJ encodes syllabic breaking behaviour where it would otherwise not occur - as T.H, D.H, K.H, L.H, H.L, or R.H.H.

Lastly, the best typefaces designed for Chinook Pipa will probably contain precomposed glyphs for H digraphs. As the proposal now stands, novel digraphs will have varying representations in different fonts. However, I believe that fallback rendering would probably better serve the needs of the mostly scholarly community that will make use of this script. Such rendering will rarely be encountered by members of the general public, who will more than likely use Chinook Pipa for the representation of either English or colloquial Chinook Jargon, both of which are completely represented by the glyphs included above. In the end, encoding H-digraphs segmentally gives the most flexibility to the community that needs to represent a large corpus of historical texts as accurately as possible, maintains transparency for the general user of the script, and presents few challenges to a typographer attempting to meet the needs of both these communities.



Combining diacritical marks on vowels

The Chinook Pipa script uses several combining diacritical marks, including an over-dot, underdot, diaeresis, and under-diaeresis. The macron, under-macron, acute, and breve are also found in Salishan texts. These last four are not placed directly above (or below) their base letter, but are instead shifted right, so that their left-hand extreme is directly over the center of the base letter. The under-macron has only been found in combination with the acute, as some writers (mostly LeJeune) move the macron below a vowel to avoid collision with the acute placed above that vowel.



Nasal Vowels

The Chinook Pipa nasal vowels have a combining behaviour unlike any other characters. In certain circumstances, they take the form of a diacritic mark over the intersection of the two adjacent characters, and in others they will render inline, just as a regular letter. The nasal vowels will render displaced - as a diacritic - only if adjacent to two regular consonants (not H or X) that are not similar line consonants, i.e. line consonants of the same angle. In all other circumstances, a nasal vowel will be rendered cursively connected to the adjacent consonant. ZWNJ will override displaced rendering by splitting the adjacent consonant into another syllable. A displaced nasal vowel will render below the intersection of adjacent consonants if room is not available above.
Figure 1-5. Nasal Vowel rendering (glyph images not reproduced)
(1) D + An + S → DanS
(2) L + An + P → LanP
(3) H + An + D → HAnD
(4) S + I + V + In → SIVIn
(5) A + I + L + An + D → AILanD
(6) A + I + L + ZWNJ + An + D → AIL.AnD
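
The displaced-versus-inline decision for nasal vowels can be sketched as a function of the two neighbouring tokens. This Python sketch rests on labeled assumptions: the names LINE_ANGLE, ARC_CONSONANTS and nasal_rendering are hypothetical, and the angle classes pair each short line consonant only with its elongated counterpart. Whether K/G and L/R (written in opposite directions according to the code chart) should also share a class is left open here.

```python
# Assumed angle classes: each short line consonant shares an angle with its
# elongated (voiced) counterpart. The class numbers are arbitrary labels.
LINE_ANGLE = {"P": 0, "B": 0, "T": 1, "D": 1, "F": 2, "V": 2,
              "K": 3, "G": 3, "L": 4, "R": 4}
ARC_CONSONANTS = {"M", "N", "Sh", "S"}
REGULAR = set(LINE_ANGLE) | ARC_CONSONANTS   # excludes H and X

def nasal_rendering(prev, nxt):
    """Return "displaced" or "inline" for a nasal vowel between prev and nxt.

    prev/nxt are the neighbouring tokens, or None at a word edge.
    """
    if prev in REGULAR and nxt in REGULAR:
        same_angle = (prev in LINE_ANGLE and nxt in LINE_ANGLE
                      and LINE_ANGLE[prev] == LINE_ANGLE[nxt])
        if not same_angle:
            return "displaced"
    return "inline"
```

A ZWNJ neighbour is not a regular consonant, so L + ZWNJ + An + D renders the nasal inline, as in example (6).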



Other Characters

The other characters in the Chinook Pipa - the letter "X", Full Stop, and "Likalisti" sign - do not typographically interact with other letters. The letter "X" acts like a non-digraph "H" and splits syllables fore and aft. The Chinook Pipa Full Stop character is used fairly frequently like a period or colon, probably due to these punctuations' similarity to Chinook Pipa letters. The logograph "Likalisti", meaning eucharist



Vowel Orientation

Chinook letters generally combine in syllabic groups according to a fixed algorithm. All consonants have a stroke direction - for P/B, F/V, K/G, M/N, and all variants, the stroke direction is top-down; for T/D, L/R, Sh/S, and variants, stroke direction is left to right. Consonants join with the stroke termination of the first consonant marking the beginning of the second consonant's stroke. Consonants, I, and E join to circular vowels and circular vowels to consonants, I, and E at tangent angles - in the original source materials, the circles are actually continuations of the consonant strokes moving into and out of the circular vowel form. Vowels often combine beneath and to the right of consonants, but generally above for the pattern T/D/L/R preceding a circle vowel plus S/Sh/N/P/B/K/G, or the pattern L/R + circle vowel + T/D. Circle vowels usually combine inside arc consonants. I and U almost exclusively follow the "in from the top or left, out down or right" rule, except that E orients exactly opposite when joining a single letter, either P, B, T, D, F, or V. An isolated E is also known to render upside-down, such a distinction necessitating markup outside the scope of the Unicode Standard. These rules having been given, the down/right rule will always be intelligible, though less elegant than contextual implementations.
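
The placement tendency for circle vowels described above can be written as a lookup on the neighbouring consonants. The following Python sketch encodes only the two "above" patterns named in the text and treats everything else as the default; the function name circle_vowel_side and the result labels are hypothetical, and since the text says "often" and "generally", a real font would need exceptions.

```python
def circle_vowel_side(prev, nxt):
    """Suggest where a circle vowel sits between consonants prev and nxt."""
    # Pattern 1: T/D/L/R + circle vowel + S/Sh/N/P/B/K/G
    if prev in {"T", "D", "L", "R"} and nxt in {"S", "Sh", "N", "P", "B", "K", "G"}:
        return "above"
    # Pattern 2: L/R + circle vowel + T/D
    if prev in {"L", "R"} and nxt in {"T", "D"}:
        return "above"
    # Default tendency: beneath and to the right of the consonant.
    return "below-right"
```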




Syllable Formation

As mentioned above, Chinook Pipa letters cursively join together into nominally syllabic units. There are several rules for properly separating one syllable from another in the Chinook Pipa given below. All rules of syllabification can be overruled by ZWNJ and ZWJ. If ZWNJ is used to break a syllable, the adjacent letters should combine with the surrounding syllables as if the ZWNJ represented a word break. A ZWJ causes the conjunction of the two adjacent characters without any other effect. In other words, the adjacent syllables should form as if the ZWJ were not there.

The most important definition regarding syllable formation is that of a legal algorithmic consonant cluster. Legal algorithmic consonant clusters shall be of the following patterns: 1) a labial plosive (P or B) followed by or following S or a liquid (L or R); 2) a dental plosive (T or D) followed by or following S/liquids or preceding consonant I (an I preceding A, O, I, or E); 3) a labio-dental fricative (F/V) followed by liquids, dental plosives, or velar plosives; 4) a velar (K/G) followed by or following S or liquids or preceding consonant I; 5) S followed by plosives or liquids, or a legal consonant+S cluster followed by a plosive or liquid; 6) Sh followed by or following R. In the preceding list, all variants are the same class as their base character. For example, rule 1) would be "a labial plosive (P, B, or variants) followed by or following S (or variants) or a liquid (L, R, or variants)".
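
The six cluster patterns translate fairly directly into a predicate. The Python sketch below is a minimal reading under stated assumptions: letters are transliterated base names (variants such as CCMM forms and H-digraphs are assumed to have been reduced to their base first), the "consonant I" cases of rules 2 and 4 are omitted, and the names is_legal_pair and is_legal_triple are hypothetical.

```python
LABIAL, DENTAL, VELAR = {"P", "B"}, {"T", "D"}, {"K", "G"}
PLOSIVE = LABIAL | DENTAL | VELAR
LIQUID = {"L", "R"}
LABIODENTAL = {"F", "V"}

def is_legal_pair(a, b):
    """True if consonant a followed by consonant b is a legal cluster."""
    s_or_liquid = {"S"} | LIQUID
    # Rules 1, 2, 4: a plosive followed by or following S or a liquid.
    if (a in PLOSIVE and b in s_or_liquid) or (b in PLOSIVE and a in s_or_liquid):
        return True
    # Rule 3: F/V followed by a liquid, dental plosive, or velar plosive.
    if a in LABIODENTAL and b in (LIQUID | DENTAL | VELAR):
        return True
    # Rule 5, first half: S followed by a plosive or liquid.
    if a == "S" and b in (PLOSIVE | LIQUID):
        return True
    # Rule 6: Sh followed by or following R.
    return {a, b} == {"Sh", "R"}

def is_legal_triple(a, b, c):
    """Rule 5, second half: a legal consonant+S cluster plus a plosive or liquid."""
    return b == "S" and is_legal_pair(a, b) and c in (PLOSIVE | LIQUID)
```

For instance, is_legal_pair("P", "T") is false, which is why Wap.tos divides between P and T, while is_legal_triple("K", "S", "T") is true, as in the final cluster of Ka.na.mokst.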

Syllable breaking rules

Each rule is followed by transliterated examples; periods symbolize syllable breaks. The glyph example that accompanied each transliteration is not reproduced.

Consonants adjacent a vowel belong to that vowel
Ip.soot - Wap.tos - Peł.ten - Tip.so - Ik.tas - Kim.ta

Consonants adjacent two vowels belong to the trailing vowel
Oo.kook - A.la - Ya.kwa - Ka.na.mokst - Li.li - Ma.mook

Legal consonant clusters belong to trailing vowels, as long as not adjacent to a preceding vowel
Klak.sta

The cluster T + L preceding a vowel joins to that vowel and only that vowel
Pa.tlach - Tlemen.tlemen - I.tloo.ilh

Adjacent consonants not forming legal clusters shall divide syllables
Wap.tos - Ash.noo - An.ka.ti - Kan.sih - Kim.ta - Kom.taks - Tsik.tsik - L.ma.lo

A nasal consonant (N/M) will form a consonant cluster with "S" or "Sh" if the nasal, "S", or "Sh" is word initial or final.
Nsai.ka - Msai.ka - Snaz - La.Plansh

An "I" or "E" immediately preceding or following an "OO" or "OW", or preceding a W vowel shall divide syllables
Ni.wa - Tlemen.oo.it - I.tloo.ilh - E.h.poo.i - Kip.oo.it - Kla.h.ow.iam

An "I" immediately following a vowel (not U) and preceding a consonant shall be considered part of that vowel
Kwaits - Oi.h.at - H'loima - Eit - Fait

An I/E flavored vowel will join with a following consonant + "I"
Fraide

An "H" following a line consonant, or preceding a ZWJ + consonant creates a digraph equivalent to the base character.
Pelh.ten - Khel - Khow - The

An "H" not forming a digraph will break syllables fore and aft.
Oi.h.at - I.h.t - Ka.h.ka.h - Ka.la.h.an - Ke.h.tsi - Kla.h.ow.iam - Sa.h.a.li - Wi.h.t - Ta.h.am - H.um - A.h.a - E.h.poo.i - Ili.h.e

A "U" will join with either preceding or trailing consonants, but not both
Kyu.tan

A "U" will first join with lone (without a vowel) consonants or clusters
Stu.il - H.um

A legal consonant or cluster bracketed by two "I"s or "E"s will share a syllable
Ili.h.e - Isik

A nasal vowel will displace and join two adjacent consonants, unless the adjacent are similar (same angle) line consonants.
L(en)t - Munde - Lamp - Dans - Sacrament

A nasal vowel will connect with a following consonant if word initial or preceded by an H, X, or vowel.
Hundred - Hand

A nasal vowel will connect with a previous consonant and break syllables aft in all other circumstances.
Roten - Seven
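
As a rough illustration of how the first few breaking rules interact, the Python sketch below splits a word on its consonant runs between vowels. It is one reading of the first four rules and of the two-consonant case of the fifth, not a complete syllabifier: H, U, nasal vowels, compound vowels, and adjacent-vowel rules are not modelled (adjacent vowels are simply left together), cluster legality is not checked for longer runs, and the names VOWELS and syllabify are hypothetical.

```python
VOWELS = {"A", "E", "I", "O", "Oo", "Ow", "U"}

def syllabify(letters):
    """Split a list of transliterated letters into syllables (partial rules only)."""
    vowel_idx = [i for i, x in enumerate(letters) if x in VOWELS]
    breaks = []
    for left, right in zip(vowel_idx, vowel_idx[1:]):
        run = letters[left + 1:right]       # consonants between two vowels
        if not run:
            continue                        # adjacent vowels: out of scope here
        if len(run) == 1 or run == ["T", "L"]:
            breaks.append(left + 1)         # whole run joins the trailing vowel
        else:
            breaks.append(left + 2)         # first consonant stays with the
                                            # preceding vowel, the rest move on
    syllables, start = [], 0
    for b in breaks:
        syllables.append(letters[start:b])
        start = b
    syllables.append(letters[start:])
    return syllables
```

Under these assumptions, the letters of Klaksta split as Klak.sta and those of Patlach as Pa.tlach, in agreement with the examples listed above.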


Archives of the Kamloops Wawa 1891-1900 (subscription required)

Dictionary of the Chinook Jargon, by George Gibbs, Echo Library ISBN 1-40680-924-1

Chinook:.... A History and Dictionary, by Edward Harper Thomas, 1935, Metropolitan Press, Portland, OR


Kamloops Wawa No 1 page 1

transliterated character repertoire, p, t, k, l, m, n, sh, s, o, a, oo, i, e, CCMM, d,
Lines 3,6,16: compound vowel "Wa"
Lines 7,8: compound vowel "We"
Line 9: irregular syllabification "kla.ks.ta"
Line 12: irregular consonant cluster "t+s"

Kamloops Wawa No 1 page 3

Line 10: syllable "PI". cf Kamloops Wawa No 1, page 1, line 1, syllable "PE"
Line 7,8: transliterated simple character repertoire, a, o, oo, ow, wa, e(i), u, h, p, t, k, l, sh, s, n, m.
Lines 6,12: irregular consonant cluster "t+s"

KMW1 page 5, Prayers in Shushwap

Line 3: Abbreviations, "T+CPC+K" & "S+CPC+S"
character repertoire h, r, full stop

KMW45 page 4

Line 7: h-digraph "Dh"
character repertoire v.
Line 6: irregular syllabification "ar.t"
Line 1, etc: Abbreviation "Sh+CPC+K"
Line 13: Abbreviation "S+CPC+T"

KMW51 page 3

Line 25: 3 letter abbreviation, "S+CPC+B+CPC+Sh"
Line 4: Syllabic breaking "A.U"
Lines 5,10,11,&17: abbreviation "Sh+CPC+K"
Lines 1,3,4,6,&20: Ch character, Sh+CCMM
Lines 2,4,7,9,15,17,&20: Compound vowel "Wi"
Line 2: H-L, "H+ZWNJ+L"
Lines 3,4,9,10,13,18,&22: Compound vowel "Wa"

KMW59 page 1

Line 2: U+CCMM
Line 2: I+I+Diaeresis Yee

KMW68 page 2

Line 28: "Likalisti" sign

KMW101 page 1

Line 15: K+CCMM, K'. Note that it looks like a crossed K.

Chinook Rudiments

The Chinook Rudiments text contains an extensive vocabulary list with transliteration and English gloss. Several pages are included for increasing familiarity with the script, while others have notes on content regarding a specific inclusion within the proposal not found in the above documentation. The first page contains a full character inventory, except for the Salish specific "X", including compound vowels, h-digraphs, the numerical values of the first ten characters, and even some basic pronunciation guidance.

Note diacritic dots on Bear, Beads, Beef, Bell & Cheap

Note diacritic dots on Grease, Easter, Priest, Sheep, etc.

Note diacritic underdot on Thread.
My apologies for the antiquated ethnic terms.

Original handwritten text(s) to be supplied.

Written in Salishan languages demonstrating macron, breve, and acute usage and the Salish "X".
Text to come