IPA Scrabble?

I've been on an International Phonetic Alphabet kick lately (thanks to the linguistics class I'm taking) and have been contemplating the possibility of IPA Scrabble. As far as I know, it's never been done officially (by Hasbro or anyone else). Want to help make it happen?

Update 2008-10-25: Cascadilla Press may have something like this. Still looking into whether it fits the frequency and value model of Scrabble.

Update 2008-11-02: Cascadilla Press does indeed have a magnetic IPA Scrabble set. The rules are a little different (mainly to accommodate the difference between Latin-character English and IPA English) and the tiles are not designed for a regular Scrabble board, but the principle is there. They determined the tile values through gameplay, though the values are skewed to work best for linguistics students: the less familiar symbols are worth more, and schwa is worth more to encourage multisyllabic words.

Unofficial, of course

I don't expect Hasbro to come out with an official IPA version. After all, the market is pretty small. Additionally, IPA does not provide the advantage of a unified tile set for every language. There would be a different distribution of tiles for each language, and the distribution in turn affects the point value of each symbol, which is written on the tile.

And that's where the trouble starts.

Frequencies and point values

[Figure: English tile frequency-value graph]

When designing the game that would become Scrabble, Alfred Mosher Butts used the front page of the New York Times to determine the frequency of letters in the English language. A similar process is needed to determine the symbol frequencies and values for IPA, but with the additional step of transliteration from Standard American English orthography into English IPA:

  1. Find a large corpus of Standard American English.
  2. Strip out all the words that are not present in TWL (or SOWPODS), the most common Scrabble dictionaries.
  3. Transliterate each word into broad phonetic transcription. (No distinguishing marks for aspiration, nasalization, vowel length, stresses, or syllabic consonants.)
  4. Count the number of times each symbol appears, deriving a distribution fraction that represents that symbol's share of the language.
  5. Normalize and round the distribution fractions to add to 100. This will be the number of tiles each symbol occupies in the 100-tile set.
  6. Take the inverse of the distribution fractions, scale the results into a playable range (standard English Scrabble values run from 1 to 10), and round. These will be the point values for the symbols.
  7. Fudge the numbers around a little bit. Particularly pay attention to symbols that can be added to the beginning or end of a word ([s], [z], [t], [d]) to make a new word—consider lowering their distribution in the tile set. (Mr. Butts did this with the S tile in Scrabble.)
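
Steps 4 through 6 above could be sketched roughly like this in Python. This is only an illustration under some assumptions of mine: each word arrives as a list of IPA symbols, leftover tiles after rounding are handed out by the largest-remainder method so the set adds up to exactly 100, and point values are scaled so the rarest symbol is worth `max_points` (10 by default). The function names are made up for this sketch, and step 7's fudging is left to humans.

```python
from collections import Counter


def symbol_counts(transcriptions):
    """Step 4: count how often each IPA symbol occurs across all words."""
    counts = Counter()
    for symbols in transcriptions:
        counts.update(symbols)
    return counts


def tile_distribution(counts, total_tiles=100):
    """Step 5: scale counts to a tile set that sums to exactly total_tiles.

    Assumption: uses the largest-remainder method, because plain rounding
    does not always add up to the target. Very rare symbols can end up
    with zero tiles and would need manual adjustment.
    """
    total = sum(counts.values())
    exact = {s: c * total_tiles / total for s, c in counts.items()}
    tiles = {s: int(x) for s, x in exact.items()}
    leftover = total_tiles - sum(tiles.values())
    # Give the remaining tiles to the symbols with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda s: exact[s] - tiles[s], reverse=True)
    for s in by_remainder[:leftover]:
        tiles[s] += 1
    return tiles


def point_values(counts, max_points=10):
    """Step 6: inverse frequency, scaled so the rarest symbol gets max_points.

    Assumption: every symbol is worth at least 1 point.
    """
    rarest = min(counts.values())
    return {s: max(1, round(max_points * rarest / c)) for s, c in counts.items()}


# Toy input: three words, already transcribed into lists of symbols.
words = [["k", "æ", "t"], ["d", "ɔ", "g"], ["k", "ɪ", "t"]]
counts = symbol_counts(words)
tiles = tile_distribution(counts)
values = point_values(counts)
```

With real data, `words` would hold one entry per word occurrence in the filtered corpus, so common words pull the distribution toward their symbols.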

Any volunteers?

I'd like to make this happen. Here are ways you could help:

  • Find out if the analysis work has already been done. I suspect that someone, somewhere has researched the phone or phoneme distribution of Standard American English and published a paper about it. However, I lack the l33t research skills to determine which journals and keywords would be the most likely candidates.
  • Write a program to filter a corpus of English for valid words.
  • Transliterate those words! Or find an existing English orthography -> IPA translator.
  • Argue about the proper way to fudge the numbers.
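
To make the filtering and transliteration items a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `valid_words` stands in for a Scrabble word list, `pronunciations` stands in for a pronouncing dictionary keyed by word (the CMU Pronouncing Dictionary is one real candidate, though it uses ARPAbet rather than IPA), and `ARPABET_TO_IPA` is a deliberately partial table that a real version would have to complete.

```python
import re

# Partial ARPAbet-to-IPA table, for illustration only.
# A phone missing from this table raises KeyError in transcribe_corpus.
ARPABET_TO_IPA = {
    "K": "k", "AE": "æ", "T": "t",
    "DH": "ð", "IH": "ɪ", "S": "s",
    "D": "d", "AO": "ɔ", "G": "g",
}


def tokenize(text):
    """Lowercase the text and pull out runs of letters as candidate words.

    Note: this naive split breaks contractions like "don't" into two tokens.
    """
    return re.findall(r"[a-z]+", text.lower())


def transcribe_corpus(text, valid_words, pronunciations):
    """Keep only valid words with a known pronunciation.

    Returns one list of IPA symbols per word occurrence in the text.
    """
    result = []
    for word in tokenize(text):
        if word not in valid_words or word not in pronunciations:
            continue
        # Strip stress digits ("AE1" -> "AE") to stay with a broad transcription.
        phones = [p.rstrip("012") for p in pronunciations[word]]
        result.append([ARPABET_TO_IPA[p] for p in phones])
    return result


# Hypothetical toy data standing in for a word list and a pronouncing dictionary.
valid_words = {"cat", "this"}
pronunciations = {
    "cat": ["K", "AE1", "T"],
    "this": ["DH", "IH1", "S"],
    "dog": ["D", "AO1", "G"],
}
transcribed = transcribe_corpus("This cat, that CAT!", valid_words, pronunciations)
```

Words with more than one accepted pronunciation are not handled here; picking one (or counting all of them) is exactly the kind of thing worth arguing about.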

I could certainly do all of this myself, but then it would never actually get done. Plus, it would be so much more fun as a group effort.


Responses: 8 so far

  1. jkao says:

  2. Jay says:

    u! ?? l?v ð?s ??di?. w?n pr?bl?m w?d bi ð?t j?d h??ft? d?sæd b?fo??h??nd ??t d?l?k ju w? g?ne juz.

  3. jkao says:

  4. jkao says:

    Huh. My first post may be caught in spam filter.

  5. Trevor Stone says:

    How would the rules and the distribution handle regional accents?

  6. Tim McCormack says:

    @jkao#4: Thanks, retrieved it. Jay's comment was filtered too. :-P

  7. Tim McCormack says:

    @Jay#2: I apologize on behalf of my blogging software, it seems to have eaten your IPA. :-(

  8. Phonetic Scrabble says:

    [...] Haven’t found actual phonetic Scrabble yet, but did find someone speculating about IPA Scrabble. IPA is the International Phonetic Alphabet, and it’s what we used in music school to [...]
