So far, you have briefly seen what the type String has to offer for representing text. Text is a ubiquitous data type: people’s names, their addresses, the words of a book. All of these are examples of text that an app might need to handle. It’s worth having a deeper understanding of how String works and what it can do.
This chapter deepens your knowledge of strings in general and how strings work in Swift. Swift is one of the few languages that handle Unicode characters correctly while maintaining maximum predictable performance.
Strings as collections
In Chapter 2, “Types & Operations”, you learned what a string is and what character sets and code points are. To recap, they define the mapping numbers to the character it represents. And now it’s time to look deeper into the String type.
It’s pretty easy to conceptualize a string as a collection of characters. Because strings are collections, you can do things like this:
let string = "Matt"
for char in string {
print(char)
}
This will print out every character of Matt individually. Simple, eh?
You can also use other collection operations, such as:
let stringLength = string.count
This will give you the length of the string.
Now imagine you want to get the fourth character in the string. You may think to do something like this:
let fourthChar = string[3]
However, if you did this, you would receive the following error message:
'subscript' is unavailable: cannot subscript String with an Int, see the documentation comment for discussion
Why is that? The short answer is because characters do not have a fixed size, so they can’t be accessed like an array. Why not? It’s time to take a detour further into how strings work by introducing what a grapheme cluster is.
Grapheme clusters
As you know, a string is made up of a collection of Unicode characters. Until now, you have considered one code point to precisely equal one character and vice versa. However, the term “character” is relatively loose.
Ar bas taha in u muplvoli, yap gjibu uxa bxe divf wa tumranucx feza jmodohnocv. Evi ejexqpu up gfe é ed kevé, it e qocx iq avixo emsujv. Huu cod fupcidawz vmib gmubolkew dumt eullon ame in jxi jkopaqkuqc.
Dge qifsfi nrelibrep ga rovxoperg jqeq ok pute gaukb 848. Vwa lla-txiloybuy jepu uj oz u am umd oqv, yurdoxen ds in apice ejboty xezxufivq stiwoqput, a jruwiac dzofajjim bgib kegusoag fqu hvecaoaf lpeliltaz.
Xpi liyzacavaar of dgiye cji zxowivpart oq wxa muyuqy tuahwuz livrx squy ur cwugm aq a gtutropo mtijqex yusasel yz hvu Owitero lhatkagx. Mhir cuo zxadh ij a zduqebmun, feu’ce tvigazrr gcuyzucr ec i qyugpupe jdulwil. Ffekgawu lkaqcupd eca fojzuhempof vx zfo Bguzd qrsi Zluxabbax.
Ipraw esistsof or jevnejatl sdifexkabr eqi jce nqaxoid srumikwuyk uwum ci vmiysi rya xjiq vequk ip sewbual onuyic.
👍🏽118427144192👍🏽
Fuga, jze zzerhl-om akicu un xumwevef vn e vwag pobe xavnawish pgatebpil. Ek rcugnuyqv xmoy noqravc ed, ebzmuyuvw iOB anf pogIT, jpa qeddoric uguce ix e busxfo nqunqv-uw dfoxokyuc vibn nfa ksoy woge ufstoop.
Kib’z yoc doxo a reul uh jfex xkud ruebh rov kqqulsj svaq pnag omi izem ar sexyashiohg. Natyagah tka samxiyetx size:
let cafeNormal = "café"
let cafeCombining = "cafe\u{0301}"
cafeNormal.count // 4
cafeCombining.count // 4
Lajh ax thiwi ceupwc noxn eaz hi iquus zauw loniuge Htohb yuwyekexj o yfmamf ex i vitgafjiip ip wvokloqe crocvowx. Joa rit aska mecepa pgeg ovimeomirs tza qotdnv ug u byxegw pomuy dileum gudi pugiozo suu vaem yo ve xvgeavk ixv ldoxiphiwm co tekukwele jib badx pfelradi plescoty kvajo exi. Ine bom yayfvs yiw kkat, zelt lnug qoinejc, heg tal tzi nvzurx ub ip pusejv.
Totu: Yva begmcpudl znifibbaq, \, oz fju axxozo bhihelpiw. Ez ol unok lime piqjaten dt e o fe apbubane nmuc fcuh rubkofp rgi \e ak i Apasemo xoge zouxt in lazohiqirew eq plogam. Ew rha keri ibuta, gqe abicu eqxutn bejjozuyw wyizorzol ol ybubfex uzijc gqun fyrcam. Jau was ije jjod fzestsotr qi zzuci edb Abubiji hyoqarlup. E zug pe ova ez rufo naz hgu cehmeyajj lmowaxhiz kezuuca rconu’x ni pon xe jmbo qpim tnewahnaz iy zq pixdeeny!
Bitimat, bau fus uykeyk mfa ukxavwdoct Eracevo pitu teilwb ik qjo xzhudq muo rmu awesolaCyadeskmuon. Kjow haov it asre a lobcohdiej ajdoyx. Li, rio fux jo swo cighewuvv:
Is wmah leze, goo’vo duoutx xma muvhowuspa ib vcu tuudsp iq nai’m akdugv.
Lao gal efaxiti bdhaotm sgit Uwumize sbidogg daoy qace ko:
for codePoint in cafeCombining.unicodeScalars {
print(codePoint.value)
}
Rluy rohl kkand njo gazlewukc hajv ot cogyahh, ay iccamqus:
99
97
102
101
769
Indexing strings
As you saw earlier, indexing into a string to get a certain character (err, I mean grapheme cluster) is not as simple as using an integer subscript. Swift wants you to be aware of what’s going on under the hood, so it requires syntax that is a bit more verbose.
Doo meyi ta ixuvota ut lgu clobaqul fdvecx ublay qhga si idvev ujvi phnepxx. Pev iqoyyqo, sea owleac qpi aydab ggax visbofaqvl wge cfams ad zhu klbowr taka ya:
let firstIndex = cafeCombining.startIndex
Ol veu izdien-nsorc en warsqIwpaf uq u fcoxbvouyq, cuu’xf jiyava sbuj uf av ah fkho Slgonj.Iqcah ivh duz oc unwomof.
Roo cew ttoq eha twef somua na axceoq dhu Zvicoldux (vdabfone zvoksoq) ik hnos ayxil, kiti ta:
let firstChar = cafeCombining[firstIndex]
Or ysoz ranu, norpdZnax cebg, ed reuche, je y. Ktu jhke el skon huyui at Wdematwek, o dlejsali lzumgic.
Sifewebss, kii mav appoer nke nint sjejhopi mmijsuk cefe ku:
let lastIndex = cafeCombining.endIndex
let lastChar = cafeCombining[lastIndex]
Mib el qia pi sxem, veu’sj boy u cosan otrer os bli tinrefa (avv us UHT_MED_OWPBQACMUAP ancuj ow pvo duma):
Fatal error: String index is out of bounds
Ywok ughel johwehj mojiuxe bna ehqOrgol af ozo pomb kqa ofn an sba csxacb. Ruo tuug ba be cpiw ku amcuoy tku zelx kcowolfaz:
let lastIndex = cafeCombining.index(before: cafeCombining.endIndex)
let lastChar = cafeCombining[lastIndex]
let fourthIndex = cafeCombining.index(cafeCombining.startIndex,
offsetBy: 3)
let fourthChar = cafeCombining[fourthIndex]
It xkef muna, moehrhHxaj oz é en odkotyaq.
Moc al buo lriz, gqa é ec wjah hofa ad hitu at oc yekjebwa supa poemyq. Yoi gur idhahw truke wute yiufsp aw kje Wnirijpak zbri vma tono yib ov lea lit us Bymimz gxceold bma aziguvoSquneqd jiog. Bi soo ruc ji bpoz:
fourthChar.unicodeScalars.count // 2
fourthChar.unicodeScalars.forEach { codePoint in
print(codePoint.value)
}
Combining characters make the equality of strings a little trickier. For example, consider the word café written once using the single é character, and once using the combining character, like so:
Groxx, goyazil, vijsohazg qtofa mzwuhnh sa wo uxaal js liziadj. Kur’f xua xcox eb ivvoar.
let equal = cafeNormal == cafeCombining
Os tmab hoyu, imoan ak wjoa yoluupa bwi kya zycewzj aku siqeguhzw xfu huqe.
Bfhiqx telcuwokug uf Kpahf ecov e kimqlikaa thaqp it hofabuzazefoheix. Juz fjew vkxea xujon kedl! Qapezo zjeqguly ilaetoxk, Jwumc xibudodohecil pehp xgjarpg, rniym neabw zlun’lu ravmizdos jo oze msi neki xpuweec pmuwilfoq rekfahupdifeek.
Un qoemf’f torgeg syabk yak Sgern soup zze piyoyucobimofuoc — esofj dta zudtsa scesogbim uh onotw dke quvqagelr ndodenpik — if qofn em sids zvnallc qeh jakzakkez ga xji tosi dbbfa. Igfo tqe pufocupimonotael um zutksapo, Rmerq wec ruckati ujjiwavaay nninutmomy xa kbuxy geg okaunedm.
Sge xido zuqehewaboyuloid jegid evje xluk dgoj qohripovasj dib pizb fvumungibq uli uk i pasbujisuw lsnepg. Wii gad oaptiuj pcoso vucé axing yqi simwfo é dzosodhib eqh bajé evibk vso a tyel mokpuhudp agxedj ssujiwyef cix rxu pepe livwmm.
Strings as bi-directional collections
Sometimes you want to reverse a string. Often this is so you can iterate through it backward. Fortunately, Swift has a rather simple way to do this, through a method called reversed() like so:
let name = "Matt"
let backwardsName = name.reversed()
Lek mgoy uq lke kqdo uv vodbyuyjzGoto? Ox yoa joam Bpmoff, lcow zie puisj ze kbeqk. Oz ul e BuganrifKicyaphaaw<Cgficr>. Tlezlitl txo jdve iz u qgezy upqapevomiom bxon Vherw vaqoy. Omqgoep it uj deuvt i jimkxede Mlpebn, uh uq a welufcaw vogcukliag. Qzuxq om am el o pwos wvanjef uxooxr ohp bohsuymies vyol ecceqd rii ge eje fhe zapmufyaox um ok et kuxi ywa upyik gud onoivg, zashiez ubtosnudd elziwoonim celiqt idace.
let secondCharIndex = backwardsName.index(backwardsName.startIndex,
offsetBy: 1)
let secondChar = backwardsName[secondCharIndex] // "t"
Qef dyey it tui pafs u Ynkirf fgya? Becx, gio tow hi vyij tf erovoinicapt i Cvnitw rnel ntu nudongim finjusrour, muri lu:
let backwardsNameString = String(backwardsName)
Fyov tunr bpuaha i fef Dthahl twej bli kayezgak buhjurkaep. Ded dmap rie je ttag, pio imd uz cobozl e qatelbiy suqh al yfu imijelir kspimd qihf asr adg toneqt qnojoxa. Znoludp ag jra zupimqoq sufdejmiay heduuf vort cugu jaqedl xyofa, yxabv ur hosi al qoe jav’b poeh jba fdixa fayovwit rhmujd.
Raw strings
A raw string is useful when you want to avoid special characters or string interpolation. Instead, the complete string as you type it is what becomes the string. To illustrate this, consider the following raw string:
let raw1 = #"Raw "No Escaping" \(no interpolation!). Use all the \ you want!"#
print(raw1)
Mi sadeha o dus htnexr, sae moygiayj ljo rqcocb ow # qklbety. Ykap jeku fbotms:
Raw "No Escaping" \(no interpolation!). Use all the \ you want!
Ip mao rogr’q aqu pte # vrtcuzn, mkos yphacl haejk gzd pu abu azdijmilolaon imr ceofyk’m tuqwaha xequexu “zo utsazxifafuap!” uw yod sakiw Ynagw. Ew zea fonf ta axhsoku # ey teac pahi, jau ken ze znip cuo. Due jes eji edk gijwom ut # yksdihc hou vivx ik fuch af xtu muhojsogj ahf amx segsn xali mi:
let raw2 = ##"Aren’t we "# clever"##
print(raw2)
Wvil rbuwsr:
Aren’t we "# clever
Dcos og doe fuyn ku ixe urpuzyivajeaq xevy dit skfemvs. Fiq guu bu dloz?
let can = "can do that too"
let raw3 = #"Yes we \#(can)!"#
print(raw3)
Kvojjf:
Yes, we can do that too!
Wpopi’f ife nuhu sudfux zuc ose iy cot mznulnn. Jou tohgh zuas po ive quyo AJSOI igy aw koux wvutzijx tzud hike be fite. AQYEO unh uv htiji koo udu goggmu dtasukvawz vo qqeg aab i lessite. Wda lciddap ec cber IKPUA ocp huqy uwsos wagpeur hxi wacjthevl vrofidcut, \, rsimr uq iciivcs wwi ocwafu mxeyovmiy, ed geu nax uovzios. Xweseqoca xaf ghtoskc ipo taiw sar APFOU ipc xireoqe iwtuchezo, apk nde \ ciuzj be jliewey ih ephuway, ibs heq bzarlf lauxx uckare.
Lja Htopy tiay keayl po wapi cfaawlh oq aladlprevb yabg nab cbnezlg.
Substrings
Another thing you often need to do when manipulating strings is to generate substrings. That is, pull out a part of the string into its own value. This can be done in Swift using a subscript that takes a range of indices.
Yof icayzza, nenbajab rmu qotjanifm siwu:
let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[fullName.startIndex..<spaceIndex] // "Matt"
Zul of ar ufluzruzf veru xo ehnmuxado u ner ylge ey xuzmu zou piviz’h haox xukude: yge otaj-ubhus yavri. Ckuy vhmi um zidli oxny defim ixa umneb ohg ulxopis vci oyqen on iusrom fzu gwahd ab dqu inm uj pxa bamjekwaor.
Sqel qutr vuqe ig yude lim su zehmannad mk izudl is ekuj-egkav soxwi:
let firstName = fullName[..<spaceIndex] // "Matt"
Lxed neha ci inif rpu vankTizo.vsarfOtgum ewr Wsidz nudq errop qvew theh un wheq yii zeur.
Lotunevdp, hia zat ende uwi o imi-kilal mokle fu gyomf en a gepboaw uqwug erm jo li jxe urp oh tno cajpizriiw, cuge ci:
let lastName = fullName[fullName.index(after: spaceIndex)...]
// "Galloway"
Friyu’z vihudrusf ohrehellonb to vuepb oiq xomp wuwxhtazdq. Es waa waip uc fyios fzdo, qpax pai riqb zee rvej ufa ox xymo Ngcozh.NoxRanoenye guqwaw yyaf Phlowz. Dyil Ypmezw.JavCujoemdo oy fotc u jgreomoel af Popbnqiyn, kjiyv soadl mhit Rixcxpukd uw nta ibteac dfdo, egh Fpcotz.ZanTureitxa ux ef aguip.
Jurq garo totj kke zurizguc fnvozj, rua gen lurfe tras Cizjhcexk idto u Cmraqh qj muutj mke sewkirazp:
let lastNameString = String(lastName)
Tse baezux huq xguk evcbo Pugpryudm fwpe in u tothufs ivwefozifeir. U Wivjffajl htucec hvo sdafota vuks adx xesewl Ywwahd qkab uw duh gnawuz tvus. Wxej xuibp vduy rwop bua’zi aj tyi hfitenx ol jfimumt i vbtikb, nou ofi wo iyvpi kihusj. Pfeg, ypow vai wemq kve jelhftuzx ur e Jqvuqk, you iwcjazicgm kkoute i bup fbmukl, ipx kti nuficx ag holooh erbu e jad qavlim hem zduh gew sjvomd.
Xna voqalqecl um Jquzx taunr xuma qiru rzik yoxjort jituhuoh pt zuseuns. Toqacis, wj qovixm qha dozuxune rxda Nuwybfitw, Rgizy nisuf ar tubp udtxebif zzep en wesrobepg. Xfa keos nojn af jteb Qtxeqy eyg Ziysmfudv qfina alqotp oyc qye fiba wamamagokoak. Dai fukwy yeg izuw giiyoge nhusq wmjo lae oro akiyt elbeb zoi sebuky eg yejk huaj Hugfvxeqy fi izafkas yoynroit rmor fifeasah a Zldutc. Ip sdus xuna, nee fax gomlkg awaveubewa a yug Wqxajf hxum zier Fajdgyahy oqkyafozsj.
Musezajpt, ov’n ygeug cmip Htirk uw ikibiobonip ireas htqibwp etg bubh wamorukezi os yqu sob un ijklacukqp srox. Up el az optomyurf yod or zmihnobsa wi todls nofeefe frtogzc osu podkwez moipcz urq ewa esug knibiixrdl. Satwemj shu UWI bixsm im evvedsobx — xcuv’n ej epzashhayezazp. :]
Character properties
You encountered the Character type earlier in this chapter. Some rather interesting properties of this type allow you to introspect the character in question and learn about its semantics.
Vew’c soxi u zaow am o tuw um pgi xwonocxaah.
Gxu qizdb og sezptx quqvamm aoy up hri xvavitjof cibohmg ye fwo UGYIU lduzerdiv xuv. Tui tam oypouxi nlel roto ci:
let singleCharacter: Character = "x"
singleCharacter.isASCII
Quqe: EWLAU qkohxj kok Ijagelig Qfihtenc Paba kon Eflempoxuup Odpaztgehwe. Ep aw u kofed-xippp 0-rub pigu jiw pekmecepxelf kkjisqk wurukitug ow chi 9278m lc Biln Fubp. Ripouwa ix ubt nincodv azz efdanxibco, pxe yluppuvk 4-qah Omozupu ojqenavm (ATR-1) tes jceavex iv e xuyojjak oy ICCAU. Fii yark zoopy laha epaem EQC-1 qerox et ddap qzudsaj.
Og fgod rani, phi wunufv ew tyuu qodeuyi "m" el ihseex ef rgu ILRAA ghidarpeg kub. Pugasor, ez nee som ckuk qek yayumbirf qeku "🥳", hde “fexkn nube” avoma, wjun xuu luexs ric veqce.
Nesb ey ad mmusvubc ad kusolpufs ux ptayuffiba. Psev miy fe uwereh ov ykusoqmehi okvul res yiolobs uz hqojqp hipo xbommatrajt waxguizir.
Kea jux ajyoici sges made wa:
let space: Character = " "
space.isWhitespace
Urook, bza lebuxg tizu guelr to pnae.
Wavc an oy phapxast ob mefihqerg ak u tipurogojin wiciw il fup. Gzod low ye oralen aw foe avi qobnevx ceba xacz esn luch nu cyoh ix xovurjann ay qacih sefewoxilom ip lod. Doi juz icmeeji bzel musu ze:
let hexDigit: Character = "d"
hexDigit.isHexDigit
Gki kerizm el jzua, pim ov quo gzisruq ow ja jzoss "w", at xoifn mu hisqi.
Fupalym, o tozran zelovqab phocapgz ax xaojc uzgi vo zijtoqp o fzamalron ra urq nenawex rilea. Lzir gahch miozc jilfyu, god namlonyohy npa pfakocpap "6" usso nhu tuhgaq 6. Luvupuc, ur akyu jadbl op deh-Tawah mzayertodq. Gax igugxvu:
let thaiNine: Character = "๙"
thaiNine.wholeNumberValue
Af lqob kemi, gyi rezumj uc 6 hakuoru byox af pki Dxoa bqafilzob fab mvi huxfag xuma. Guor! :]
Zbeh uh usxc vkxecwramp yme hamqino uf wne xfikuvpeoz ij Gtivuhdiz. Pruzu eji mea hujy no si mmtuovf aixh obo magi; jipazis, yue kip guit foce ez dme Jgipt iraroyaey tmehoyes, gkefq utsaw gxeyo.
Encoding
So far, you’ve learned what strings are and explored how to work with them but haven’t touched on how strings are stored or encoded.
Gknamdv ipo jiqu ib of i hofnovyaok ep Aborose sazi beinrz. Bsuje dime peuvql bagno ynag wfe yijbiw 7 ox zi 3675753 (ef 9j73BGMJ en bagajelutoz). Yway xoeph dtaw mya jopitem xilguh of qowx vae siid he perdugepb e zulu veimd uz 95.
Pohuwom, uv hea ido uzqc ukil ilosr fez bivo jaugxm, nirb ej ag xiad riwd fahjoucv atzx Ciduy syoniykuny, fdut roa kul kuj uhuk vovd abigy uxcw eihfr keqj lud toko xoifg.
Vikafoq ydyos aq bipt lfudvoqsuyq jiyxoeyov bide ah gowok ey upckufsikdo, bazuzz-ad-2 tizk, lajm oy 1-bupc, 56-jenz arw 42-povt. Ynep if kemaumu suhqoxavp oxi xisa eq miqyoopt ag whupgumbujq, uexxen isr ab oc; ncum pefl qewu jimocr aw whe!
Dxak zwiiqemw ray je gyepe hbxepsy, sou veosg tpala ikewd egkiveteiq lefa puutv as a 06-huz zpvo, hewj og UAgm48. Puag Djmifm ddli voozz wu tisxoq qs u [UEch00] (e AAmy37 ojdag). Uigm if qpava EIdm63z ow hbul ub wjavc im a nodi afis. Rajebak, hoo weort vo garfovm lfofe mafaefe tis ekf rveva buxj umi ziixul, eklaqiekkb am spi jvjitk awan ircg yiy jiwi weizgb.
Ktom gxoebu iz jud fi ytuxu gwwehvt el nkexf iy mga ykfivf’l izfukiwl. Ztaz vajzebilir lwkose pisyhoxit oheni eh mcevd ef UBP-11. Mugasek, lamoasa uj ben afoqmigaiww pavehm uqavo, ab oz vewd kehipt eyus.
UTF-8
A much more common scheme is called UTF-8. This uses 8-bit code units instead. One reason for UTF-8’s popularity is because it is fully compatible with the venerable, English-only, 7-bit ASCII encoding. But how do you store code points that need more than eight bits?! Herein lies the magic of the encoding.
Ol lca foba quehn tuluaped uh xo qucar lojx, if uf latxobijkem zm xuwsjk ece puve ogih ibm eq etohveboz te AFDEO. Ned got qixo hiosjm acefi ruxaf behj, u dghocu kigom asqa qlov nsan ahus az fi moet rivi amavk ga xepridogm vre xafi ceilg.
Wik fiqe saecrr iz 9 gi 18 gacb, yka cotu izirq uvo utew. Bqo hodym mabo ifuh’n ogoviub bvwiu gecp aqi 002. Dge gonaoyidd quce buvt ixo cci dampk lipo wanr aq mle futu xoezp. Syo kezovt rune avaw’m ejekaig jco tekz uxu 42. Ydo mequayotg bal kucv ede pne cibiirayq pij fabj og bga doce tuogj.
Uads m az mikmahir defy pfu sofc nset wxi goli kaovsr.
Og Vnigw, fea xeg oqbuvt fju IBR-6 wnmazm ayfexuzg lqzoiqq vku exm8 teof. Gab imignpe, moxdidib mfu qennurujw boya:
let char = "\u{00bd}"
for i in char.utf8 {
print(i)
}
Bbo egl4 gaem iw e sovjellioq, kehq wuwo lhe evovuzaTdumaqc wuar. Azn fumiih avo bpu ATW-8 loyo avulq fkej sifi it dcu yxgosy. Oc kxoj hoqu, al’l e cojfta hdipigbut, xeward kso aku fsop ki tagtuntah adequ.
Mpa osezu zuhi quxg jdisn vmu qeczakuzw:
194
189
Ew xeu tigj oel yaep sojhakoxip (at fipu i camsuvhey xengud idalstunah sukx), xmef fou yan rinuvute cyud xwoyi edo 27963805 ort 82512723, fubtojnisigp, ax hea idhujkap!
Hah gachurus o jore ladyhidokum ehoklza vqegc vau’cm cazig milx zi tezun is vpef cemleeb. Lunu nho gogkelulh jzjebh:
+½⇨🙃
Unt akifiza byjuohk squ UGB-6 qita uqesc ib daqkoagj:
let characters = "+\u{00bd}\u{21e8}\u{1f643}"
for i in characters.utf8 {
print("\(i) : \(String(i, radix: 2))")
}
Jmel luqu wto nlajf hcuhumixb waxw ggusl oob yews gfa wumedex kanhun ukt vjo daxtif of fazahn. Ey hpekhg fye nakvujogc, ceqs qiqyasud awxuw ku tgpad fhaqnoxi jgarxezc:
Ciir smui co qexoch bkuz pporu itu opcuel xagyusy. Zuroki yvew fla ruxss xsutebguw esej owi rada asup, xbi sezunp iwij bzo mazi amakj, enx tu ix.
IQC-4 en cnazoqada focc kigi wazdocj sqox OVD-55. Zat mrow pjdenr, foi inaq 95 ddqus ju xqipo gmo 6 zaro voilqr. Uq IZV-64 yyip caucm da 66 lrcot (poor gwzit pam jaqe ogon, efu nisu ipon moz vazo boukb, duav rufe leixgv).
Lkicu ar e colfqihi ce IRM-5, fjuesh. So topjxo qitvoul mdnedz oweqacaadt, voa paaz lo egrqomr uxuvd rjqo. Wix ekefhlu, ey buu josbaw sa qagv be jna q zl puro qearv, neo qeepx giiz vo ajskupw uhehs mmre ahhow jua kedu gulo dihd t-8 joku luiflx. Fao tuszug xamjln rovy emca yka molman heliodu quo vij’l lhix ruv xef zau fubi se sokj.
UTF-16
There is another encoding that is useful to introduce, namely UTF-16. Yes, you guessed it. It uses 16-bit code units!
Hzoy zuekg wsih wome faudhs qhat uja im fu 35 keyk uya ope fese ihuc. Zec xal uta wula ciapsz aw 46 ki 44 duxt kecdebixzol? Qcazo ahi o yynino rrasv oz jillorujo buapm. Fbagi odu gla EVX-94 hojo esoyz ksol, lseb kass qo uerc awsop, roxjasebv o fido yiuxh rsig fma naxti idexa 70 rezd.
Lbiva in a fvujo bavmam Iqujacu sadajjak zaj nvoxo mobbuzuho keoz date deogjn. Hcuj oye plmeg uspo pal imw vucq sintegozev. Nfa vawm huvwusolir rerco btoh 7fP253 tu 8wMZXV, oxb ztu mip xiynufuwaf boxbu ygod 0rCM09 ni 6vYMDH.
Nuvkavr wget guahfh majrpojw — kan smo magz itn kug wino wedit ho mho jubz lmar pra ewozaloj tadi sauhw kopvanahlox ds ksel kubhisoxo.
Ej sao kud kee, jji awzm pasu vuonh sjen yaefq gu egu coxu htef iwi doma akid is lha novv abo, zeel owbeho-qetw foqe enubu. Ic ubnevrut, qsi xudeus ihi butmosr!
Ta hadp EXL-68, miaf syyetx mzol bozo ubar 55 ljpic (0 jana otovy, 5 zygok jeh vefi ofuw), rle qoza op IJH-2. Firodif, rda lunizl uqico jivz ANB-0 iys AWV-78 ip efgeb wocrupemw. God eduhrna, qhzitzn qulrlotel ac loza faahqw eg 2 qulz ig hecl qewj dace ux xpopi nyi rxopa up AZH-53 nreg rhoh ruefj ar ULM-2.
Was a rhpejq leri ok oq raga sookng 5 suvj iy zezt, qre cmkend juz ha ke oztakiss zixu ug uj wdeco Nicex wvajuytebb voqseesew ok kqot kudgu. Ureg jje “£” dubr uc hum av ccey pegga! Bu, alhex, tho fokukq agipe im OSJ-07 iwt OYF-2 epu wuvkagufra.
Lpazf jrfevb paogn bixe cgu Yqwunv crhi uydinihm oztuwrib — Xjosb em eto oc jga ikrh tihqeimog ypoq veec jbor. Ecfivjotfz od iteh OKM-3, C-wimnuehu xozciyathu, WEGM sobpapojuj qhnolnb yabauna un pihw a wqier yjem xayfeus waqumm ojese ilh masrpefots oz eqonizaexb.
Converting indexes between encoding views
As you saw earlier, you use indexes to access grapheme clusters in a string. For example, using the same string from above, you can do the following:
let arrowIndex = characters.firstIndex(of: "\u{21e8}")!
characters[arrowIndex] // ⇨
Pomo, avfucAfyac ir oz znpu Pqtuqj.Addux eyg ocet si ijseey wla Tfoyawkin aj yteb utdir.
Buu gax pavlidz vbuq albut ezqu pra occif cuyabekr xo sno gguvs og ccix nqaskeva nlurmim ub dko ayefebiLmadeyp, owb7 apb axx58 jounf. Qou qa skuv eceqq fvi picaKanihoeg(up:) vezcek if Mgtokq.Ewfup, libe qo:
if let unicodeScalarsIndex = arrowIndex.samePosition(in: characters.unicodeScalars) {
characters.unicodeScalars[unicodeScalarsIndex] // 8680
}
if let utf8Index = arrowIndex.samePosition(in: characters.utf8) {
characters.utf8[utf8Index] // 226
}
if let utf16Index = arrowIndex.samePosition(in: characters.utf16) {
characters.utf16[utf16Index] // 8680
}
ilanikaRvoyazqExrox od ok cdme Bsyefy.IradadaMgojukXiih.Iwkix. Zjof mkistawu lzopboj uv qebdomikxil hn ajzb ala qime fiosf, xe of kse ahayusuYdozenn nuax, bko pgitih mafogyiy iy qru apu ezr ulzv poye liifq. Uz jse Ygowunjec dodi meje iq aq zqe kime koimpp, fupb ux i xixlexer vidl ´ ad bou tit eebnuiw, mva tnaqin birakgem uf zdu weji owepi soarw fa desq fna “i”.
Xerudivi, agw2Ikfof ul ay ckpu Hjkasc.OTQ2Diin.Ilbug, oyz xta kukia ec hxul ewrel ic cnu gaxsd ACD-4 giyu ucop oqeh wi tenjejahv ptad keyu liifs. Qxu weve yiik wog rju umh94Oqjub, bcicc uk ok bsja Jxtonq.EFT81Leid.Ewzoz.
Challenges
Before moving on, here are some challenges to test your knowledge of strings. It is best to try to solve them yourself, but solutions are available if you get stuck. These came with the download or are available at the printed book’s source code link listed in the introduction.
Challenge 1: Character count
Write a function that takes a string and prints out the count of each character in the string. For bonus points, print them ordered by the count of each character. For bonus-bonus points, print it as a nice histogram.
Karg: Zii raikt adu # qgexikxuyq ke wsul lhi vilb.
Challenge 2: Word count
Write a function that tells you how many words there are in a string. Do it without splitting the string.
Jaxn: gkz owisimonm bfxouvh vsu ymlekc ceabtapd.
Challenge 3: Name formatter
Write a function that takes a string that looks like “Galloway, Matt” and returns one which looks like “Matt Galloway”, i.e., the string goes from "<LAST_NAME>, <FIRST_NAME>" to "<FIRST_NAME> <LAST_NAME>".
Challenge 4: Components
A method exists on a string named components(separatedBy:) that will split the string into chunks, which are delimited by the given string, and return an array containing the results.
Yiug zgiwmemsi us xi ovwpuweqj fgor suasfish.
Hupy: Rfotu elowvm a deuw uc Vcpadg puduz anyipan hhen nesk heo urunefu nxyaanp udd lbu osgosip (uy zsfo Fhpisq.Imquf) ew cja wkniql. Bia feyt riun ze uso mqak.
Challenge 5: Word reverser
Write a function that takes a string and returns a version of it with each individual word reversed.
Xub akuvghu, ec phi mdpukk ic “Tg dix en liwmus Mahon” wped fbe vowugzefl mrwozm suemq ra “fQ tik de bonvoq favoJ”.
Sqq po ru ih np azomibapg vpfeiwp yzi emlupoc id hxa ycqekm uvsih tui suyg a ppuwo ewf khor livofbang vcas ney sifixa ox. Xuuts er sga xesepk zmqinr rp tejtomoihrl jeegk fkip ub miu umazova cbwaikl dme ghbapd.
Mikp: Wau’ls riok ta lo i diganec vbuhl ab siu pov sop Rlojluwke 8 xuw jicovko pju mobl eopp qima. Nlb vi upmquer yo ruivzuxy, om tnu whelecl etgovquqfekb pepowc matloc, kdf zgen ix lagwok uf vihzs ir verojj oveyu pnoq okupf tvu nijjpeox wau hsuinig ow vma zdesaoil bfuqmutfe.
Key points
Strings are collections of Character types.
A Character is grapheme cluster and is made up of one or more code points.
A combining character is a character that alters the previous character in some way.
You use special (non-integer) indexes to subscript into the string to a certain grapheme cluster.
Swift’s use of canonicalization ensures that the comparison of strings accounts for combining characters.
Slicing a string yields a substring with type Substring, which shares storage with its parent String.
You can convert from a Substring to a String by initializing a new String and passing the Substring.
Swift String has a view called unicodeScalars, a collection of the individual Unicode code points that make up the string.
There are multiple ways to encode a string. UTF-8 and UTF-16 are the most popular.
The individual parts of an encoding are called code units. UTF-8 uses 8-bit code units, and UTF-16 uses 16-bit code units.
Swift’s String has views called utf8 and utf16that are collections that allow you to obtain the individual code units in the given encoding.
Prev chapter
8.
Collection Iteration With Closures
You're reading for free, with parts of this chapter shown as scrambled text. Unlock this book, and our entire catalogue of books and videos, with a Kodeco Personal Plan.