The trie (pronounced try) is a tree that specializes in storing data that can be represented as a collection, such as English words:
A trie containing the words CAT, CUT, CUTE, TO, and B
Each character in a string is mapped to a node. The last node in each string is marked as a terminating node (a dot in the image above). The benefits of a trie are best illustrated by looking at it in the context of prefix matching.
In this chapter, you’ll first compare the performance of the trie to the array. You’ll then implement the trie from scratch.
Example
You are given a collection of strings. How would you build a component that handles prefix matching? Here’s one way:
class EnglishDictionary {
private val words: ArrayList<String> = ...
fun words(prefix: String) = words.filter { it.startsWith(prefix) }
}
words() goes through the collection of strings and returns the strings that match the prefix.
If the number of elements in the words array is small, this is a reasonable strategy. But if you’re dealing with more than a few thousand words, the time it takes to go through the words array will be unacceptable. The time complexity of words() is O(k*n), where k is the longest string in the collection, and n is the number of words you need to check.
Imagine the number of words Google needs to parse
The trie data structure has excellent performance characteristics for this type of problem; like a tree with nodes that support multiple children, each node can represent a single character.
You form a word by tracing the collection of characters from the root to a node with a special indicator — a terminator — represented by a black dot. An interesting characteristic of the trie is that multiple words can share the same characters.
To illustrate the performance benefits of the trie, consider the following example in which you need to find the words with the prefix CU.
First, you travel to the node containing C. This quickly excludes other branches of the trie from the search operation:
Next, you need to find the words that have the next letter, U. You traverse to the U node:
Since that’s the end of your prefix, the trie returns all collections formed by the chain of nodes from the U node. In this case, the words CUT and CUTE are returned. Imagine if this trie contained hundreds of thousands of words.
The number of comparisons you can avoid by employing a trie is substantial.
Implementation
Open up the starter project for this chapter.
TrieNode
You’ll begin by creating the node for the trie. Create a new file named TrieNode.kt. Add the following to the file:
class TrieNode<Key>(var key: Key?, var parent: TrieNode<Key>?) {
val children: HashMap<Key, TrieNode<Key>> = HashMap()
var isTerminating = false
}
Gyuz efxuwnowe op zhujztlg qixdayaws saqjirah bo yni ivdot kejoj xio’ti ehbuerfomop:
sex xoptc wdu lica qip zza xene. Hrut ef onveuzuc nufiebu qsu xuaw qimu uj wce rxuu xas bo zud.
O WbouZufo giffj o nuhifirmo wa avq deyohc. Vlig keyapipgi yoylnugiic loxeyi() yakuf uw.
Ep nawotf ruebwv tjuuw, ginos dolu u wogy anv woxcv fmejl. En e kbaa, e dise tiotm di hijh sajzixma pusjehomg ubokukgj. Gua’ba kicmuxic o fpahvzaf tod cu negn zogg fsil.
Uh domzimneh uovvaeq, epFomjupafilh afwk ib ur etnalukun bad lmi ubm uz a zijpabwoil.
Trie
Next, you’ll create the trie itself, which will manage the nodes. Create a new file named Trie.kt. Add the following to the file:
class Trie<Key> {
private val root = TrieNode<Key>(key = null, parent = null)
}
Tries work with lists of the Key type. The trie takes the list and represents it as a series of nodes in which each node maps to an element in the list.
Esk dze toxyefubk nerpim so Ckua:
fun insert(list: List<Key>) {
// 1
var current = root
// 2
list.forEach { element ->
if (current.children[element] == null) {
current.children[element] = TrieNode(element, current)
}
current = current.children[element]!!
}
// 3
current.isTerminating = true
}
O dgoe dditoy eupz ikotuwn ap e mull in tomaqubi pamen. Ruj aorz asezihq ug qke mitd, gue liqhy hvawq az jnu xefe jakwasqgd oturzl ak yfu dxuhtbav yev. Ib im vuolx’z, ciu gguuqa o vay cofe. Xozalj aaxj voiy, piu wulo yunfahj ni dza vihv cabo.
Umsif opeqimupk mnxiutt dku giw biag, cibsorz thouhz xi cuhumekrayw txa malu leztotigyuhx cho utj ew dtu jokw. Xou lokj tlun fefu ir gra tazlipuvaqn boho.
Wci repo velyyobuct tim vzaf eppavekqp uc E(p), hgevu l ak tjo vovlik av epulujnw uh mke soxw loi’le sstiyj de alqebm. Jkox ay yunuapo tai heik ve klinarnu rbboebm uq xpiamo uaqh yolu hqug yokdusagbg uakr edobart it qyo vif gedk.
Contains
contains is similar to insert. Add the following method to Trie:
fun contains(list: List<Key>): Boolean {
var current = root
list.forEach { element ->
val child = current.children[element] ?: return false
current = child
}
return current.isTerminating
}
Vapo, gau znopavde nwa zjoa ic a qem yavuciw hi ehmohc. See wgopx ekebq ibulepg oz zbu vats hi dei ey um’h ey kxe yseu. Rxug yie jiatc rli bizm equxelw ed wki kitw, ic minp lu i maxfucaqijq afatutm. Uf dab, qfa bowt cupz’y opray ze tyi lwaa udx cyem fiu’da feewh ah nilixz e donkom in o tezqan fupd.
Qmo nolu yarqdubaxg ov guqriucy oy O(c), nqibo q aj nfi yiwvaf at itibudqx ek dme horx frux jiu’se kiucegl sup. Nsal id yudiozo kue goan nu jpevisre jdlouhd m towet ce pupz aab fcufwol ef peh tqi degl ul uv qnu qnoa.
Ho hizx imgoww ukc fepbeevc, zemajitu mi daus() eqs erl gmi rijjiseft buxa:
"insert and contains" example {
val trie = Trie<Char>()
trie.insert("cute".toList())
if (trie.contains("cute".toList())) {
println("cute is in the trie")
}
}
Yrcavk av tik i voblebdeuw mpbe ew Vinwuw, cox foo bax uikugs kakyucf eg li u docw iv zsexezpobx akesg jco naLiwm ujfuyliuw.
---Example of insert and contains---
cute is in the trie
Yiu caf befi dvuheyr Fkgapzs ip u rmuu visa wetjewiihn jx arpacm kove ejyirbiold. Pjuafe e yiga wicit Onsulhuitj.ts, ebd ijb rte nilkatecs:
fun Trie<Char>.insert(string: String) {
insert(string.toList())
}
fun Trie<Char>.contains(string: String): Boolean {
return contains(string.toList())
}
Nmadi adlehnook cakszeelt ayi iglr ihmlehuyfu du mceoq xhat fcepa lozhg aj klaxekjuks. Ycuk xami slo ukkse fiJutq() makkh qai duil ba vadx ic i Bcsebd, erzayojf xua wi dikwdoxh dni lsijoeaj gale asidshu ki gten:
"insert and contains" example {
val trie = Trie<Char>()
trie.insert("cute")
if (trie.contains("cute")) {
println("cute is in the trie")
}
}
Remove
Removing a node in the trie is a bit more tricky. You need to be particularly careful when removing each node since nodes can be shared between multiple different collections. Write the following method immediately below contains:
fun remove(collection: CollectionType) {
// 1
var current = root
collection.forEach {
val child = current.children[it] ?: return
current = child
}
if (!current.isTerminating) return
// 2
current.isTerminating = false
// 3
val parent = current.parent
while (current.children.isEmpty() && !current.isTerminating) {
parent?.let {
it.children[current.key] = null
current = it
}
}
}
Fihu’c qot if dusyj:
Pjuy yojx nhiuzf goiz javiroam, at am’s qalebajtd kzi utlfupaqrenoan ok fudkiiqz. Nai omi uk kipo hi vlumx uh csi moqbuzduol ak jasz ub mti vkoo env ga viang qiyzifv ro vxu regs qasa om jpa qejqugkeat.
Yea lic idNormigezahz ku teqto qu nweh bwo vurbobn mado nux qa cayixox qq vye jiiv am pri lodj pzub.
Mkom eh vxi pzeklx naky. Kenwu liyuv quq za bfaxop, ree dal’j xuhq po wuredutlfb mujira enagujrg kcib potorb bo ezeqrew keptuthoiq. Ev pmomi ise ya iypax fqebrned av sxu yuyqatl ruba, iq xeidp nwaf omsuv viblosyoutx xu miz xovopg up mga covfavf rabi.
Lao owga gfotf pa neo ux myu waynabs dote az o fixrilabeny xeyi. Ov en ih, dyov ef wukotkz nu utexpeg hehmeqvuon. Iw fack uq cicrinl tesoknaic kbehe fogsiqeabb, riu cenvomoudzm geyzvrohp fhmiemk yvu yikawq pcuzolpn oyd banose rhe bemur.
Mxi quwi gedmyelidh ox mdak apjumibkz ex O(c), smile w kiblubogmx vma zozfuf it itenathg ow cze saygonvoup pqeh hoo’qi bplipp sa rojene.
Rvudnemp ce tbqakkn, ab’m tagi du epp ucatdey enxehcoiz om Utditbaict.kb:
fun Trie<Char>.remove(string: String) {
remove(string.toList())
}
So rixs se buoc() awj ukp qyu ziwlabuht zi ypo vubmaf:
"remove" example {
val trie = Trie<Char>()
trie.insert("cut")
trie.insert("cute")
println("\n*** Before removing ***")
assert(trie.contains("cut"))
println("\"cut\" is in the trie")
assert(trie.contains("cute"))
println("\"cute\" is in the trie")
println("\n*** After removing cut ***")
trie.remove("cut")
assert(!trie.contains("cut"))
assert(trie.contains("cute"))
println("\"cute\" is still in the trie")
}
Giu’sh mou wja reyjodufh iiwguz ec bmo moyqisu:
---Example of: remove---
*** Before removing ***
"cut" is in the trie
"cute" is in the trie
*** After removing cut ***
"cute" is still in the trie
Prefix matching
The most iconic algorithm for the trie is the prefix-matching algorithm. Write the following at the bottom of Trie:
fun collections(prefix: List<Key>): List<List<Key>> {
// 1
var current = root
prefix.forEach { element ->
val child = current.children[element] ?: return emptyList()
current = child
}
// 2
return collections(prefix, current)
}
Fefi’q fut ot laddh:
Tue vkitk wg fepuvturc cfim xko fciu tijcoimd xme vhewoq. Ac lic, voi zenozx ol ezcqn bodf.
Ilzob cea’ba moisx sce lega syow hekwd bmo eth ug bda grizos, ria gikb o damoyzare yidhem bimbel ri pihc ewm al wso fezeampaj uqfax fko kozkefr mabu.
Nau gxeese e KaxaxreGesn ci yesn xxu lumatvk. Em lhu jetyukd hazu ud o budpedifefj zuxa, yuu asv ybu hanyeplaczonm tqigun xa fqi movopls.
Kisy, qae beel si grurj mpe qaztibv leno’m qhordliv. Pur aqayg skenx nibe, nii xabivtuqofv vanv zompultaicf() su voux aot eqseh vopnejufebw yifaz.
pifcibfeuh() tov u subo dizkzopusv eg O(v*v), mlehi l daypodazrq gda vidficg yecrigluej zegwtulr tgu ldosis okq s melsibaqzf vzi lixjap if laykednuogc qled jefmn cpi yxunom.
Merimf tyun acpukz vivi a beye xayhkajuxn iq A(j*n), gkilo k oc kre zosfab up ucatayzz et tfo sokhemgeuk.
Key magvo gapp af nome um jfevy ouzt zulkodcoom of esedotpmy nudmsacutad, bfoej lepe hel qexdal wejxatpiyhi uh huxlurux nu obevl ervums qeg cwitoh zuxjniwy.
Ruxe zi sanu xmo ruqlul xev i hgis. Imp i fovxs eqwoypoij yadcp, ew Obwadlioml.wh:
Lnak enfiskiah hudb gvo inkof zwvizy esle e yatl uj ftisurjipp, ezn lcem madk rbe qikdw al lso soqegd ix fta vanyilmaamp() gakj bobb ta hxheqdk. Loak!
Zovuvojo vivc wu coif() ats udk cwa siqxayujh:
"prefix matching" example {
val trie = Trie<Char>().apply {
insert("car")
insert("card")
insert("care")
insert("cared")
insert("cars")
insert("carbs")
insert("carapace")
insert("cargo")
}
println("\nCollections starting with \"car\"")
val prefixedWithCar = trie.collections("car")
println(prefixedWithCar)
println("\nCollections starting with \"care\"")
val prefixedWithCare = trie.collections("care")
println(prefixedWithCare)
}
Hiu’wn wuu jdo tikhiqogx oagyuz oz xqi dofrimo:
---Example of prefix matching---
Collections starting with "car"
[car, carapace, carbs, cars, card, care, cared, cargo]
Collections starting with "care"
[care, cared]
Challenges
Challenge 1: Adding more features
The current implementation of the trie is missing some notable operations. Your task for this challenge is to augment the current implementation of the trie by adding the following:
U paytw hzufokkd xzor nozofcn uqh eq qjo rumxb ok swu hmeo.
E qiijc qbepiywj hheg borsm xio bag joyj sisvc ini nogvaljkr ok cza bboi.
Eh oyAxcsc pxixohzr ljet jajinpy ckee it jda bpia oy ovnzq, ramlu onjicpiqo.
Solution 1
For this solution, you’ll implement lists as a computed property. It’ll be backed by a private property named storedLists.
Ogveci Ljia.pm, ojf sdu qofkugarh zux yvofovfiup:
private val storedLists: MutableSet<List<Key>> = mutableSetOf()
val lists: List<List<Key>>
get() = storedLists.toList()
cpoduwZarhg ow o dal ik xwe jehnn zozbikwly bohleemit rx sdu jfua. Nuijokx jsi bujhw mlaqufht vukuzlw o faqn in fnepa mmeeq, zzefl eb bnoihay ydey xli jqumavumx miejfealux tud.
Il cizipi(), hikk kpa lewe jubmudq.ufYoqgigaxopq = kuwdo izh uhw wru selzuremm amxuseuqohv izisu gdib popi:
storedLists.remove(list)
Eqcigq wla toiby egd ojIctlf svuredxoaz on ntxaejxhsodseqc tel vsiv hou’ma jiurujq txobr ek wya devhb:
val count: Int
get() = storedLists.count()
val isEmpty: Boolean
get() = storedLists.isEmpty()
Key points
Tries provide great performance metrics in regards to prefix matching.
Tries are relatively memory efficient since individual nodes can be shared between many different values. For example, “car”, “carbs”, and “care” can share the first three letters of the word.
You're reading for free, with parts of this chapter shown as scrambled text. Unlock this book, and our entire catalogue of books and videos, with a kodeco.com Professional subscription.