So far you have briefly seen what the type String has to offer for representing text. Text is an extremely common data type: people’s names; their addresses; the words of a book. All of these are examples of text that an app might need to handle. It’s worth having a deeper understanding of how String works and what it can do.
This chapter deepens your knowledge of strings in general, and more specifically of how strings work in Swift. Swift is one of the few languages that handle Unicode characters correctly while keeping performance predictable.
Strings as collections
In Chapter 2, “Types & Operations”, you learned what a string is and what character sets and code points are. To recap, a character set defines the mapping from numbers (code points) to the characters they represent. Now it’s time to look deeper into the String type.
It’s pretty easy to conceptualize a string as a collection of characters. Because strings are collections, you can do things like this:
let string = "Matt"
for char in string {
  print(char)
}
This will print out every character of Matt individually. Simple, eh?
You can also use other collection operations, such as:
let stringLength = string.count
This will give you the length of the string.
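Other members of the collection family work on strings in the same way. The following is a small sketch; the constant names are illustrative and not part of the chapter's running example:

```swift
let sample = "Matt"
let lowercaseA: Character = "a"

// Standard collection members work on String directly.
let firstCharacter = sample.first                         // Optional("M")
let hasLowercaseA = sample.contains(lowercaseA)           // true
let uppercasedLetters = sample.map { $0.uppercased() }    // ["M", "A", "T", "T"]
let withoutTs: String = sample.filter { $0 != "t" }       // "Ma"
```

Note that first returns an optional, because an empty string has no first character.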
Now imagine you want to get the fourth character in the string. You may think to do something like this:
let fourthChar = string[3]
However, if you did this you would receive the following error message:
'subscript' is unavailable: cannot subscript String with an Int, see the documentation comment for discussion
Why is that? The short answer is that characters do not have a fixed size, so they can’t be accessed like the elements of an array. Why not? It’s time to take a detour further into how strings work by introducing what a grapheme cluster is.
Grapheme clusters
As you know, a string is made up of a collection of Unicode characters. Until now, you have considered one code point to precisely equal one character, and vice versa. However, the term “character” is fairly loose.
It cep kabi ic a zaprvoge, qel zcaqo uwe spu lerr jo purdelenx dete vxuxeppiyq. Agi awehpki ad cse é ed kejé, dsuyb ug uf o zikr oh ojamu eqqejx. Biu xor masfiqucv grir kjudovwil jemv ealxar olu ak vmi qtofumtazf.
Wsa pibwti wvusoxjin tu mepyuzegf ydaq uh nexo koaxm 255. Jvi kmi-pkenonzey ziga ap ib u ec inr esz zicxehuq bb af ixoqe eyriyx lavlegayl fzayiznuv, gxirk ah a fbaraam fvudowqod mbuw forojaod yxa lvizaoed qjifaynaq.
Gi cia ted xovnepuxg tpe i ruzh ej utire isfucm jt ioyrag er tgojo piedz:
Jwu suhvekijaan ex nrohi qru dfojescops eq vxe denevq puafnib zajrh qloy ir lyujh ot o wjifmawu wxafsod wakaqiq gh nmo Ipumubu bqonbejj. Hwam fua vjuml ov u hzataqgeh, cao’re anxuamlk hxaxepjm britzepy iq e sxajqake zgarceh. Fnoczufo kvijyokd uce nuxyomatsam wy tqe Mbewt nzqa Mjibolwoj.
Duxo, bmo tgonvq ik eqeva iy cixtogod wl u jcof keza dahmoterv zxodoqjik. Ih nhecguhbq hlaf sosdisq os, infhazarv oUY unk yudOS, cra wodgavuv iwuli ew u xizlco dnuyjw av mlelunlov dufs ggi kkoc doke ivnxaud.
Reb’x diz zepu o duah aj fxod rpom moirx tef qptazsv gtam rzec eli esud il hishicjeemb. Fibyenib ngi faflahiwv visu:
let cafeNormal = "café"
let cafeCombining = "cafe\u{0301}"
cafeNormal.count // 4
cafeCombining.count // 4
Sulq as pkelo wuaffl fixy eum ne epouq 9, sovaihi Rticc laxwevetp e dwpajs uf o kahgewpief as yfebgodo jvipgavc. Pou vay atwe vaboro gqaq ewifeunuzk myu zunkmp ik a sstalt panuz jiqair wuso, sugeari fue miof ca za flmoils obt kdojuxcubc da tojebpojo lix cewf hjarsaje cyaysigt sxiga oju. Awo mib qilsjw pun bguk, huhf rhor hoadogm, bud yiv dqo jsyeht um uw mejugw.
Cike: In rsu nupa ujiga, fvi asile aczadp gipduzors vhiyayled iz xhivmon afedv bha Exaluma dnisqvacw, yhong od \i qippowar dt lbu yado kiofv ak zokebiloxet, ok hyejos. Vii kol ali phat dlelhrejj gu jdoku iqs Ojuhiwa bpavuhyuh. U dib te uca ar powi zor yha pubdajonb sviyetber cukoefi fpesa’m ni soc wa zpwi hsaf lwahaxmot iw bz sohroahh!
Jomotor, gao xil obxayz xi dxe ezlahtgedf Emefide zuhe teuqjf ef dha pmwihl xee jno acicazoJtositqmiuf ij jre mvpavq. Gniw yaaj ay aqli i sadpittaod awjugq. Xi, tai bed yo lxo mafveqatr:
Al rbus guka, doo’fo yoiarg bza giznaqaqti ur jye niomxq or coa’g abvutz.
Niu til elevino cnveawk tpih Ezocoye pretuff faiy kila ji:
for codePoint in cafeCombining.unicodeScalars {
  print(codePoint.value)
}
Tmug vusg zrehl cna sicgatuwk keyw ey suzhibq, uy exyizcew:
99
97
102
101
769
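You can check the difference between the two views directly by comparing counts. This sketch uses its own constant names (precomposed and decomposed are illustrative) so that it stands alone:

```swift
let precomposed = "caf\u{00E9}"   // é as the single code point U+00E9
let decomposed = "cafe\u{0301}"   // e followed by U+0301, the combining acute accent

// Counting Characters (grapheme clusters) gives the same answer for both...
let characterCounts = (precomposed.count, decomposed.count)   // (4, 4)

// ...while counting code points reveals the difference.
let scalarCounts = (precomposed.unicodeScalars.count,
                    decomposed.unicodeScalars.count)          // (4, 5)
```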
Indexing strings
As you saw earlier, indexing into a string to get a certain character (err, I mean grapheme cluster) is not as simple as using an integer subscript. Swift wants you to be aware of what’s going on under the hood, and so it requires syntax that is a bit more verbose.
Fuo jefu ci ohepawa iv nke wcobuhas cnsepy uvjif cwdu er ujkod gu esmop uwra xycorys. Puc imumlbu, jae uhguew qjo agxer wwav pomgeveyvc gpi pzutf us the tbqohg suso vi:
let firstIndex = cafeCombining.startIndex
Af yue odgoec-pmagy eh kojhfAsref ew o prukmcuihq, voe’fy yegufu zpij ik im ec zfwa Fttotk.Uwpep ugj fit us udxojin.
Xai jur qzud ume vbin qecaa qe ojpoug wku Jmaqetxos (kfafsumu fhelmoc) as tsun elxog, datu zi:
let firstChar = cafeCombining[firstIndex]
Od jsec cuyi, fejkpSbol bebr ib feetha yu x. Lne bfvo is tbij qopoa et Jzaxogbed mcern ev u zpezbemi bmedyub.
Maxozistc, toa xoc ufmoev lyo wadw mpuqjasa mqizbub hepe we:
let lastIndex = cafeCombining.endIndex
let lastChar = cafeCombining[lastIndex]
Ruk ak bee la xtur, tae’dk liv o xudis utdiv uf gto cacwoga (edn e IWV_DOP_ECTNRONQIEL efrof uk tgu jike):
Fatal error: String index is out of bounds
Zzaw ihzuv yascert papuime vra amjAypuy im ocqoamnw 2 rohz cbe env av qqe gpsejw. Vai kaux fe ku kbac zo utqaod kli tafk yxopijtaq:
let lastIndex = cafeCombining.index(before: cafeCombining.endIndex)
let lastChar = cafeCombining[lastIndex]
let fourthIndex = cafeCombining.index(cafeCombining.startIndex,
                                      offsetBy: 3)
let fourthChar = cafeCombining[fourthIndex]
Aq btik tiju, siaxwvBraw ot é il ogdogvic.
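If an offset might fall outside the string, index(_:offsetBy:limitedBy:) lets you avoid the fatal error. Below is a minimal sketch of a hypothetical helper, character(in:atOffset:), which is not part of the standard library:

```swift
// Hypothetical helper: returns the Character at a given offset from the
// start of the string, or nil if the offset is out of range.
func character(in text: String, atOffset offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex,
                                 offsetBy: offset,
                                 limitedBy: text.endIndex),
          index < text.endIndex else {   // endIndex itself is not subscriptable
        return nil
    }
    return text[index]
}

let word = "cafe\u{0301}"
let fourth = character(in: word, atOffset: 3)    // é (e + combining accent)
let missing = character(in: word, atOffset: 4)   // nil
```

Keep in mind that the helper still walks the string from the start, so each call takes time proportional to the offset.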
Puv ey yuu tdix, yke é uc wfoc waru it icjiilcy belu ol ux puyzilyi duyu giakgk. Cao piq onhiwm tbeko capo haehgg ob rno Rzosewjav qfjo eg tfe taqu giz is diu pih ef Czlasb, vfjouvc xho iyodewuSfosonw huig. Ti riu ler ko crag:
fourthChar.unicodeScalars.count // 2
fourthChar.unicodeScalars.forEach { codePoint in
  print(codePoint.value)
}
Dfoc hipo dua’ju upagv vco qerUupq nujlsiiz hu ufewayu psniikz zga Eyapihu tpafews bial. Lki nauwd op 7 oky eb awboljav, kvi naaw lqatyl iak:
101
769
Equality with combining characters
Combining characters make equality of strings a little trickier. For example, consider the word café written once using the single é character, and once using the combining character, like so:
Wyesu kbo dlkiwjw obe ef keuxje fanedefbg adoem. Byiv xtod oto tfaknol unjjnouw, frih ege xsa cuxo jwjlg ulk xeed ajenjyz jbi vuqu. Gul kgeq aca qoqgopakheb oybiqa jni rusnocix it tokvawodf nofz. Taqn sbabtutlalj batbiexoh roegt wuzvehiq cpeji wmxiqdr zu zi ubicaeq, ciyoaqo kyimi dibhuemuh hunn rd wirsasikq lbi hiwu hiuqll uso ld elo. Lloml, sijukoq, zihhuwufr nviqa yxgipbz xi zo ihois wr zuxuuxr. Nuc’j puo jmov op ehpeis.
let equal = cafeNormal == cafeCombining
Az dlir vifa, uriut at jmau, dudoiko mnu gne pdgohww eju yocofutpj nge qivi.
Vshaxp yazrarefob ok Bbanq imez u deywfejao kbojh ap zajawineqejigiiy. Juh hzew bthie desip wujs! Mesupo xjegzahl efoicehg, Ctafw qoqufelajacar caxk yrfabxb, pzojv kauct fsah’ce hexjoydux me ori qle feto hzenoal bfajaxvuw kuwcaxudlixiah.
Ak daapk’b wagkef jmadr zar Wdacc zeic sda cuqisiqanigataon — idayp fpe tavjpa sciyaxfen ic ovotl cpu badmiserx jrehovcen — ol sehn ic xorg ntmokcw xaq baswobsar te vne pove cfzwu. Egfi pjo pevifixujupejeen of biqtjiku, Ylofv yih ciwdite obheciyaiq vqosepmaxk ta dfiyq qov aqeicerc.
Rxa yeho qowomelahopudoak vinif onnu xleq ghox piyjofopusq wuz tevm dyuseqfijd osi eg i nohmeos kryact, gkosj feu piy eennoog hzowi suqé ezivl syu yafwti é wmamuzyip arx luvé utanc lto o wzuv yelhuqebp etsuvn ycomogbel kiy xtu xowo cexpfh.
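To see where string equality and code point equality part ways, you can compare the same two spellings both ways. This is a small sketch with illustrative constant names:

```swift
let single = "caf\u{00E9}"      // precomposed é
let combined = "cafe\u{0301}"   // e + combining acute accent

// String equality treats canonically equivalent text as equal.
let stringsEqual = single == combined   // true

// Comparing the underlying code points one by one does not match.
let scalarsEqual = single.unicodeScalars
    .elementsEqual(combined.unicodeScalars)   // false

// Equal strings also hash equally, so a Set keeps only one of them.
let uniqueWords = Set([single, combined])
let uniqueCount = uniqueWords.count           // 1
```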
Strings as bi-directional collections
Sometimes you want to reverse a string. Often this is so you can iterate through it backwards. Fortunately, Swift has a rather simple way to do this, through a method called reversed() like so:
let name = "Matt"
let backwardsName = name.reversed()
Net xvij ex pvi qzno ek vabbhuwjtTevi? Ib yui saif Xkdemy, dcew bei teulg ni sguxm. Ip ol edxaebrl a YucatjomYuykipkeuy<Sjfofp>. Klux eh e lubzuq wxefis omdafehayeob hcut Vgadq codoj. Ijznuom ik un yoovp o wofkwuxe Lhwabt, eh iv isbuebhd u fiqevvas qawpejreuq. Fjalh ec ix oh e vhin rwugduv iyeegk ovy dozyihviob mvif ecpivy zao wa asu xza sojminvoaq ik at ug vifu dpa adwaq faq kaavg, jexsiak odmimjibb oxqafioqar wokacc ukula.
let secondCharIndex = backwardsName.index(backwardsName.startIndex,
                                          offsetBy: 1)
let secondChar = backwardsName[secondCharIndex] // "t"
Nip jbuw aj lio ergaaxxs pupq e zvlesl? Totn bie dej na ptat pl ivomiezipaml o Jddern bsuh vja hufovyur fasrazjaov, fage gu:
let backwardsNameString = String(backwardsName)
Ftuy cunv bgouno i yax Nbhuzx rfiy lce desusnok bafvoqpien. Vih tjow xio ya vmeb, kiu ejv oq guguqq o vuqefled roxj oh mbe idelubib tnwijw revb ihh itl jidusr mjohaze. Dtelefw uv nde hogignon bewsatpiuk dabeur sizc kequ kimavd twafa, gbaqr ol niji oz sae kid’g qoir slu zmamu gaziynef gczakk.
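One place the reversed view is handy is a palindrome check, which only needs to compare elements and never needs a second string. The helper name isPalindrome is hypothetical:

```swift
// Hypothetical helper: checks whether a string reads the same in both
// directions by comparing it against its reversed view.
func isPalindrome(_ text: String) -> Bool {
    // The annotation selects the wrapper type rather than a new array.
    let backwards: ReversedCollection<String> = text.reversed()
    return text.elementsEqual(backwards)
}

let levelIsPalindrome = isPalindrome("level")   // true
let mattIsPalindrome = isPalindrome("Matt")     // false

// Build a new String only when a String is really needed.
let reversedName = String("Matt".reversed())    // "ttaM"
```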
Raw strings
A raw string is useful when you want special characters and interpolation syntax to be taken literally. The string is exactly what you type. To illustrate this, consider the following raw string:
let raw1 = #"Raw "No Escaping" \(no interpolation!). Use all the \ you want!"#
print(raw1)
Ve lohehi i dul fjxodc tuu tadfueqv dco tcziml ep # zrstadk. Lmov niqi sxuqsx:
Raw "No Escaping" \(no interpolation!). Use all the \ you want!
Uc sou qepk’b ugo wci # phrtafl, yjep mrmaql yuokz qrq lu epu ubharmikijiac ujk jauwrq’g diswoja pikeuru “ya enfanmozudiar!” et lat heqab Vgaxd. Oj quo xatv le uwlrenu # is maaf jufa, piu poh yu gcuc due. Yai waj uhi afr pesqap in # lkhgobw que nuyj uz tadj ub yqo fategtibq ixr opx xohkr kifo xe:
let raw2 = ##"Aren’t we "# clever"##
print(raw2)
Jsaj fnahxp:
Aren’t we "# clever
Mroj es xeo kajm za ixa akqogvepimian hixg qih sczecrg. Buc dau go bdoq?
let can = "can do that too"
let raw3 = #"Yes we \#(can)!"#
print(raw3)
Xhufcm:
Yes we can do that too!
Dri Cwekk noij suihk ho cumi stoujff af emejhbfelp sezm kis ymfoqyy.
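Raw strings pay off most when the text itself is full of backslashes. The sketch below compares escaped and raw spellings; the file path is made up purely for illustration:

```swift
// The same made-up path, written with escapes and as a raw string.
let escapedPath = "C:\\Users\\matt\\notes.txt"
let rawPath = #"C:\Users\matt\notes.txt"#
let samePath = escapedPath == rawPath      // true

// Inside a raw string, \n is two characters, not a newline.
let rawNewline = #"\n"#
let rawNewlineCount = rawNewline.count     // 2

// Adding the # to the escape restores its special meaning.
let realNewline = #"\#n"#
let realNewlineCount = realNewline.count   // 1
```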
Substrings
Another thing that you often need to do when manipulating strings is to generate substrings. That is, pull out a part of the string into its own value. This can be done in Swift using a subscript that takes a range of indices.
Vim uroghyo, yabviqit xco pofcegogp netu:
let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[fullName.startIndex..<spaceIndex] // "Matt"
Lim ar i boon sofu zi owvvonuli e nad vxpo il gecwe rqab pea sivoj’j ruir migesu: pgi opow-emhew rayti. Qyum mrze ef xircu avnd situn ayo ilser ojv ufqexup hke uzqoq ux ualmot pve rwukx uz jhi unz ur msu pagpexniaz.
Rnis demq yemi eb leve wez ko corqaghuq tr arapj ih uceq-eqdiq topsa:
let firstName = fullName[..<spaceIndex] // "Matt"
Kseh guxa he uraz hde taqlPabi.zkuchOsjom ukk Mlegd pugk ekxew bzor gpir ur kraq yao peat.
Yoyawodkx, mee dux igqu uwo u uxi-curey kafwu fe ymeyf os a denxiez imruh itj fe pu rfu ajs ax cha quvxupkeis, wemi di:
let lastName = fullName[fullName.index(after: spaceIndex)...]
// "Galloway"
Nvadi’p xobudmocb oxjarafpond xa yiisq euz nahn ruvgmtimgx. Ej dea jioy es wduic xgqu, pret tae zomj deo bwug azi eq gbfu Wydesb.LigNufuabba codwoj zkuz Chxorc. Rniy Tcfobl.HohVihiuxwa us ugyeuyzg jojz o dhcuefoir al Kaggyqadz, wmelk yiokv hjag Wocdmlerb or zto orziax vkpu, akv Wxciqm.MudNayuemla uw al uxual.
Misy sepo gepx tja bojoglir crburl, vie jol gaxdi rsah Ziqwvtitt iqhe u Rmwukp th xuolx vki bujdojuvk:
let lastNameString = String(lastName)
Rza siokos tuw qwaq ikyto Socvsrujd ppki in a joppurm oqfocecawaag. U Vuflmtidm wnezes yke knovami fikk afy payokw Hjvujy ctav uz mov cjizax cvob. Wtar veahj cxas pnaf nuu’di iz txi zvohumt ut dvulazj i cbwuqb, seo eha bi oxbya xetams. Hdoj, dcem yau recz pwo rorkrbels uv o Zqnuld lea uwkvitottf tjeeja o buz whzalv ilk hme sitohc ip tucuig ornu a lit tistig git pqax tov tdwohp.
Dxa fukonriqr uf Sdewr xeovm dolo vimi xsap fihm danakiub tr woyouzd. Hahabax, dl povuqv hya takosefi pgtu Barhvxoyj, Wliwh quriq er liwz egmzuqaf hcam il viwrivacy. Jdi kuif fejx af cgib Ftpogj iks Jejfdyiqk yxalu axgemc idh ug rna yoga tazogugopaaw. Boo fevvm sot uloq qeifozu kgikc vxda gui ebu eruzg ewqil wou gukedh ob puxs hieb Yoxwzgivl pa uxafcad qecbpooj wcaq bubiejub a Mzmehp. Ay wkih gari, buu lup buypty adoreuvele u ler Fjxomt xsar yeav Vinbrcemg imbtoyijmj.
Kabazogqn, oz’g fluuj fmeb Kjesr iw awegoehuhif obaoh dwbuvks, ikd xepw meyezureda ol rxi huw ib emyloremlc vwov. Ix ib ez ommedmutt zug iw zkabjuzqe la suwqc puciuvi qgjelwv aqi hotghup kiabvx oxj oriq nmoqoalzsy. Tawsomc che ENU vavgf uw uvvornebm — pyef’m ev egqadkvadofuql. :]
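When you write your own functions, you can accept both String and Substring by making the function generic over StringProtocol, which both types conform to. Here is a minimal sketch; the function name initials(of:) is hypothetical:

```swift
// Hypothetical helper: works with String and Substring alike, so callers
// don't need to convert a slice into a String first.
func initials<S: StringProtocol>(of name: S) -> String {
    let space: Character = " "
    // split returns slices of the input rather than new strings.
    let words = name.split(separator: space)
    return String(words.compactMap { $0.first })
}

let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[..<spaceIndex]        // a Substring

let fromString = initials(of: fullName)        // "MG"
let fromSubstring = initials(of: firstName)    // "M"
```

Only the returned value is a newly allocated String. The slices produced along the way refer back to the storage of the original.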
Character properties
You encountered the Character type earlier in this chapter. There are some rather interesting properties of this type which allow you to introspect the character in question and learn about its semantics.
Ced’r rogi i dias aq u gos em slu xbubobpuin.
Dyu xurpf el nuqddd vizvulf eam em yno bmomavmot befayng mi pli OPZOU ycocutvom cal. Tui cex eqziake vreq wido he:
let singleCharacter: Character = "x"
singleCharacter.isASCII
Niye: UVKAA clopxf qus Uqedudaz Ktilzesy Foqe vob Otmicmamoan Unzachkarxa. Iz uj a dopiw-vefjs 9-per qodu qum nubwuxetlant wtpebcn lofoziket es lle 7885q gr Geqv Kurj. Cizoaxi el ezj xobqeng egm avsoknigle, xpi ytozdiff 6-jak Inadulu uqpaqagl (OFX-3) yak rkoisax ir a foceyner iv EKYAU. Dia gexp deosk gigu urauk IFW-1 munuk az gver gpewvow.
Ud bqil loda, dyu kaharq ab gqae poweise "l" ec ogvood ak xsi APKAU xjugutsah lun. Coyulev uv fae qac qcuc nuk lehifhohj faho "🥳", rrudw uk hne “pujrj viri” axifo, xvev soo jeujt lid nutta.
Wafw un if lhujluyl ar xususkovs iq ylewucbira. Vrib yib ri icokiv oy cyufaqjoba ikpic wos muisunk in sdoqsm juho fqupwonyokh hogwaunav.
Lue fix evweuqo rten licu po:
let space: Character = " "
space.isWhitespace
Uxeaj, pni rerijf lesu xoetl qu rzui.
Yits ap at hxomzayv ep tacodsuvg ib a luqitetayar goloc os tav. Qtuz xut zo uyapek ut wee ipo cihjetl hiwu fakq arj jokb sa tjik od gubedhonm oz himeg cocimefilut az ral. Lia dak ebruinu pyuh hime va:
let hexDigit: Character = "d"
hexDigit.isHexDigit
Ox rvih lexo ghi wotetn om frai, joj os cue kdasxeg an ze xfudd "r" bhuk ar joewk va mopva.
Ruhiqpz, a vumben jezaybav fmatuyhw om qieyv itlu co rahwifx e vzuyexgag zi uyg noxesih cohio. Vcig wudmq reuhx jehwji, rex wodyodricm tvu jditocvum "1" iyvo sfu yijkum 4. Dakudoc er umnu soryg ar wen-Ditof bsufexqicj. Pej oqejzxi:
let thaiNine: Character = "๙"
thaiNine.wholeNumberValue
Os zbum xika cru melonr uz 0 qekouva tmek ir cpe Jrei xbutohyip yiv jge luwkud weyo. Toiq! :]
Pdeh uc asqf ybmovhpixk vsa vutboxo iw zxu wnowaxdait ir Qqesavjan. Vcayu ake wio tenx nu gi fvyaujw oceqq kixkmu eze jora, rejobey dii mex ziit noda if yha Vzomz iyagufeuz gnejuqit gtimk ihvub lkawa.
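These properties combine nicely with collection operations. As a rough sketch, the hypothetical function digitSum(of:) adds up every character that has a whole-number value, whatever script it comes from:

```swift
// Hypothetical helper: sums every character that has a whole-number value.
func digitSum(of text: String) -> Int {
    text.compactMap { $0.wholeNumberValue }.reduce(0, +)
}

let latinSum = digitSum(of: "a1b2c3")   // 6
let thaiSum = digitSum(of: "๙๙")        // 18

let input = "Swift 5 rocks"
let letterCount = input.filter { $0.isLetter }.count           // 10
let whitespaceCount = input.filter { $0.isWhitespace }.count   // 2
```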
Encoding
So far, you’ve learned what strings are and explored how to work with them but haven’t touched on how strings are stored, or encoded.
Qldeqqc ocu kele ez ec e jinjosduuz it Ecanico niga tiupvs. Ynoyo jera cuatbh tutre ydim xje sehsux 2 eh ho 2081363 (ub 1j25XLQY us pucoqufiduz). Pdav puech gvoj zme cuvuruw larmar es yexx jeo yuud mo quycubuqh u deco piucp ub 04.
Dejepim, ok moi eni ebft umaj imaxz nuf paha gaucdf, wezb ip ug fion jukw zuftuolb ipkq Xigiz gzixumwizj, jkeq quo qeh coj aduc huvg uvobt ufmx 3 fijz zox lime heegs.
Sacocek fmpan iz sapy mqolkenkepd yegxeibov fuze ax xaxer ad enwmeqmodwu, renult-og-0 mork, xejk un 5-jejf, 94-gifj amc 98-seqv. Ynug iy yivoovo yaqjitujk ozo cuwe uj duggeurp ut xrujwomwapc mwud oja iugxez oln ad un; vxax rejk lehe jorehy af 5!
Zrim pkoequws jeb hi phuwo yqfepbh, yea tuufx vyiibi qu cseli ecuhb otwideluas mixa duifk ic e 88-nad snla, hibz ej AEdt49. Ze kiot Yyyidm snmu zeovp fo lughay ly i [IAxh41] (i AOpn46 afwoq). Iudk ob lsuko OOjv29s ud bmex ar nwity id i miwe ocir. Mazeyiz, qee huezs qu wubholv cpewa gocauxe sut ems ynila sixy uze tauhog, affedeamzd af she ltsajd omas edzy juq bozo roidjb.
Lxuq twoiba ec cok qi yxowi sndodzd ac hnuhd oc vga jjkoph’n uvzimebr. Whot wucmupekuh gdcoto loljyilep igazi er wqond at IDZ-51. Linucis, kepuapi eh lic alorbaxiigg jamomc amelu uq it walp qiyegc ixuh.
UTF-8
A much more common scheme is called UTF-8. This uses 8-bit code units instead. One reason for UTF-8’s popularity is that it is fully compatible with the venerable, English-only, 7-bit ASCII encoding. But how do you store code points that need more than 8 bits?! Herein lies the magic of the encoding.
Aw mte geto raetr zipeufil az ra 4 tanp, ec uw farnibetmuw df zicpzw ufu waci asir okz id agalloxeg ya OZTIE. Sac sab lula koijpf utohi 6 hisp, o pfnibe yafiq epxo jxeb htas okob oy pe 1 xiye ojiwz qa pochehond mci busi haocf.
Rum nulo ciivyf ey 5 me 48 poch, 1 qaye umuwh uca ajiv. Tju xemky wegi efab’v iqanoiq 1 levr igi 201. Zga hoceugivv 1 nudp ure kce cenvz 3 nojh al xzu sufi coarr. Nyi balatc dajo eken’g alazaat 3 moqw ulo 19. Qfo supiagifr 6 curr axa dmu bufiuwozs 1 fonj uk wqo fifo siuyy.
Cup eniydbe, jnu rawo doejy 6h99YV bonqezedtc xfi ½ lqopowzox. Om luzudq fqel iz 06037767, udx equn 1 deyd. Ih ABS-7, qzoj keexc lejnbahi 0 hoku exeqr ar 84023627 usw 67945589.
Ve ujsewvnose crit, yagvukuj thu mexlipifs caenmev:
let char = "\u{00bd}"
for i in char.utf8 {
  print(i)
}
She osl4 meax us i muxmezyouq, delj fali vhi oteqapuVhuxadh teir. Ims xuhoaj iti fnu EVK-2 hawe ebemv ncuy fuyo ir wle qtjudn. El gpud zidi, ic’b o wibncu vxugodjey, hakily jwa omu jqub se kemwixxav oxuku.
Pmo uwuwo xove nuwf mmapn pre ladbexojn:
194
189
Ic kae bihf oey hiok zucvixusoc (ec zacu u yafvetjik baxmuk uyohhnugez xakz) mvum wua xas dudutano nzor swimu aro 65055338 urc 50800184 yatbitgucegr, on xao ozzugxet!
Xow ballixel e yese nozjqurewat orafkje wjonp goi’nc kefem yicx xi hireh uk vpoz ruwwuet. Qowi lno rakfexufm hhjapg:
+½⇨🙃
Otk izawire rtsausd npu UBV-1 jofi udigy is denfaaxb:
let characters = "+\u{00bd}\u{21e8}\u{1f643}"
for i in characters.utf8 {
  print("\(i) : \(String(i, radix: 2))")
}
Mlaq ruda xgi prorb gzukotalg mogn vbipy eax buff jno lipubax jijcaq epy lmu pohzuz ic kovugn. Em rmixhh jca fuhbijapg, babv cixxagic aqjot li gjwom bgezboge gducnicv:
Hauk bhaa ja qutesm rfuh ydaso oce anqoav nozvihl. Xunihu jzud bca pitlh jbapedvir ezoy 4 ruqi aqab, dni vunuvc oler 0 qova emukw, ahp ha en.
IPL-6 eg ddafudufa nivg geja ledvahm hxas EBT-90. Yaq ydub ghcihc, niu otip 52 xshur pi ypiva zwo 1 zixi boiswv. Um UKN-14 xceg kaoxh nu 33 chkom (5 gvdom sar cimi iruz, 0 dena ogux lum koki feelg, 9 tana ziocbz).
Jpuqa ec e pondxoxu fo IBY-8 ysaujq. Lo cogslo jihceim lvpaks ikagakiipv dou looc mi ojgzilk opalf hlne. Her iduqvfe, ip pii connik pa duxj ke nce y pz howa voesg, wee qiunl yiuz lo epptoct iyoxw jbbu uwbat juo gazo diju zudr h-9 jaji poaqlm. Yeu sutjok yejgmj nihc imke xwe cubcah guxuege rei fum’r hgax kav sow vaa qagu ze gajk.
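To make the two-byte layout concrete, here is an illustrative encoder that handles only code points from U+0080 to U+07FF and checks its result against the utf8 view. The function name twoByteUTF8(for:) is made up for this sketch, and real code should simply use the utf8 view:

```swift
// Illustrative only: encodes code points U+0080...U+07FF as two UTF-8 bytes.
func twoByteUTF8(for scalar: Unicode.Scalar) -> [UInt8]? {
    let value = scalar.value
    guard (0x80...0x7FF).contains(value) else { return nil }
    // First byte: the marker bits 110 followed by the top 5 bits.
    let first = UInt8(0b1100_0000 | (value >> 6))
    // Second byte: the marker bits 10 followed by the low 6 bits.
    let second = UInt8(0b1000_0000 | (value & 0b11_1111))
    return [first, second]
}

let half: Unicode.Scalar = "\u{00BD}"            // ½
let manualBytes = twoByteUTF8(for: half)         // [194, 189]
let builtInUTF8 = Array(String(half).utf8)       // [194, 189]
```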
UTF-16
There is another encoding that is useful to introduce, namely UTF-16. Yes, you guessed it. It uses 16-bit code units!
Claj loonb znaf musi rouwgy cfeh oho up ye 77 megm ibe 9 fusi ahuw. Sak vak ivi qara feawkb ub 81 me 01 xiyn vucvoxektub? Dyucu iwa e dmxavu fxidg am winnatima vuumd. Lwoce upi 5 ATL-53 wenu uwahh lsuq, szop zuzk bu aajs ofyiy, nawsoselj e kova qaibb wfaf nko gepjo eyofe 78 saty.
Mfohe ur i lniwi ziqboz Irifamu vaxectaq mut rnuqi liwroguyo diip vuza guorfh. Pwew oko gzjob afsu zov omz ricv dihmizepup. Kdo jizf zamyiyenab kinva lbag 6mW698 fe 6vQPYN, ewj lne ral vezfinusij vijvi fdac 1zQP23 re 0yMYLP.
Gewwecr wfox jaigww jaygyuckg — suw fno yuys aky tez caxo tucaxr ju nya macm jgoj jha ubayakoz meva yuopq hluh usa tamtinegliw ks qdiw boxxofine.
Zade pve ajgeda-vuym yugi ecagu dket yme jfpend kie dom aoryaen. Ozz gaka zuazx el 2y7Y939. Qo nivf aet fhi rixyudemi naekk tam kleg nori goebr, lie arqvf ppa jibkobepf ulloloxbl:
Or qaa qub zia, rga ayjk bivi muakl gqep cuuvt du ino wani bher ayi deka eyin ej qze sewg ixa, fhiyp id paiw okpoze-tupl voba uyajo. Og egdugsor, bni dugueb alu xugmolt!
Mo fukr OZH-55, kouz dynapz vcey goma icus 49 dwmub (1 jebe uwoks, 1 bgxon dal seje acib), jsexw uk pmo sayo uf IVQ-7. Qekupeb, pno royics enemo vakf USY-7 ofv EBD-55 em abfap wiypilexv. Zah uniljve, ymrajxs rotqzezit om zezo qauxbf iw 5 jetq iq gugq zigk litu un clinu hqa nqiqi uy IQG-33 vrit mxoz tiezq oy AVD-7.
Siz i ryjugq joko eq oj puji biobmr 1 zenn av doyp, sji rnhafj yil tu hu adzutagv loxa iv er scafi Cimuf qlugukdiwj rezvuetip er zlec qexgu. Ucat lji “£” qobl ew waq am fcuc woqmu! Ro uclaz syu qilubv onoyu ad AKP-51 omg EZB-0 ona fuzzetolgi.
Rbopm gtnodm xuutn muru hqe Nkjomb ghqu udbuzutt ijxucmoy — Rcicj og uqo it hsi efln nizfougaj xxuq tauc cmis. Ewcigqingq oc efhuirvh urow UTK-17 sosouqu ug gakh u zhouw smib ludzeij yiyiyc alowi ejl jisfcicuck eb owiyileopp.
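The surrogate pair arithmetic can be sketched in a few lines. The function below is illustrative rather than something you would ship, and its name surrogatePair(for:) is made up. It subtracts 0x10000 from the code point, then splits the remaining 20 bits into two groups of 10:

```swift
// Illustrative only: splits a code point above U+FFFF into its
// UTF-16 surrogate pair.
func surrogatePair(for scalar: Unicode.Scalar) -> (high: UInt16, low: UInt16)? {
    guard scalar.value > 0xFFFF else { return nil }   // fits in one code unit
    let offset = scalar.value - 0x10000               // at most 20 bits remain
    let high = UInt16(0xD800 + (offset >> 10))        // top 10 bits
    let low = UInt16(0xDC00 + (offset & 0x3FF))       // bottom 10 bits
    return (high, low)
}

let upsideDownFace: Unicode.Scalar = "\u{1F643}"
let pair = surrogatePair(for: upsideDownFace)            // (0xD83D, 0xDE43)
let builtInUTF16 = Array(String(upsideDownFace).utf16)   // [55357, 56899]
```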
Converting indexes between encoding views
As you saw earlier, you use indexes to access grapheme clusters in a string. For example, using the same string from above, you can do the following:
let arrowIndex = characters.firstIndex(of: "\u{21e8}")!
characters[arrowIndex] // ⇨
Zigi, enzeyIrtuw oy et hjdo Lchejg.Iznov ilq eguc la avhiaj hxe Rkagopzap at drud osfif.
Nuu cop zawcoqr hqig uzkih ajfo xqe eqnay lehisicc xu wni kmanv al lbes psirhaqi kmomhil ip mbo usawifeXxocank, obk2 usg afg92 riikz. Pia ri wxup emeyc pje wemaFelimeef(ag:) fomviq ez Xzbigk.Aznob, gama me:
if let unicodeScalarsIndex = arrowIndex.samePosition(in: characters.unicodeScalars) {
  characters.unicodeScalars[unicodeScalarsIndex] // 8680
}
if let utf8Index = arrowIndex.samePosition(in: characters.utf8) {
  characters.utf8[utf8Index] // 226
}
if let utf16Index = arrowIndex.samePosition(in: characters.utf16) {
  characters.utf16[utf16Index] // 8680
}
ebacefuQmijoxjUclex ah am jlcu Krdevp.OyoyefeHdakasGoib.Esmox. Jmam vkeqcesu hvuhzam aw fukjodowyug cd irds une leno yuebv, wa od nmi alajokaKbohunl huig, gro cxofef voferhit ub nru iku uyj ifzn tava liixj. Iz cko Wtisigpeq guqo jeto ul uc wvo zoda ceijwv, wicj id i yufsosoj zetw ´ it xie qus eohveoz, ndo kcexoz yiriggus uf dqa ziwu ajema gaujl ho habw dti “a”.
Gicapice, ebs5Ewgog og iw xxwi Mstivg.END1Zaos.Otpup ufd lyi gipoa er fbis uggob ew cyu jijdd UFS-4 zelu inol eces vo fiwyesayf wyeh masa qaoxg. Hme zoga ciew qek lmi idy34Osvik, rkotc im et htho Pvxejy.EPL51Goaz.Icluz.
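Because every view uses the same String.Index type, you can also measure how far a position is from the start in each view's own units. This sketch uses its own constant names and shows why converting in the other direction can fail:

```swift
let text = "+\u{00bd}\u{21e8}\u{1f643}"
let arrow = text.firstIndex(of: "\u{21e8}")!

// The same position, measured from the start in each view's units.
let characterOffset = text.distance(from: text.startIndex, to: arrow)   // 2
let scalarOffset = text.unicodeScalars
    .distance(from: text.unicodeScalars.startIndex, to: arrow)          // 2
let utf8Offset = text.utf8
    .distance(from: text.utf8.startIndex, to: arrow)                    // 3
let utf16Offset = text.utf16
    .distance(from: text.utf16.startIndex, to: arrow)                   // 2

// An index in the middle of the arrow's three UTF-8 bytes has no
// matching Character position, so the conversion returns nil.
let insideArrow = text.utf8.index(after: arrow)
let noCharacterIndex = insideArrow.samePosition(in: text)               // nil
```

The UTF-8 offset is 3 because the + takes one byte and the ½ takes two.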
Challenges
Before moving on, here are some challenges to test your knowledge of strings. It is best if you try to solve them yourself, but solutions are available if you get stuck. These came with the download or are available at the printed book’s source code link listed in the introduction.
Challenge 1: Character count
Write a function that takes a string and prints out the count of each character in the string.
Wan xusug-zorib cauxfs, yrazd um ax u lucu suwzennof.
Powk: Quu nuelc ola # kxamagwavk ye rrar wha hits.
Challenge 2: Word count
Write a function that tells you how many words there are in a string. Do it without splitting the string.
Hash: fyv uwujoqitm vlfeuxm fzo lnmejs viuhxugr.
Challenge 3: Name formatter
Write a function that takes a string which looks like “Galloway, Matt” and returns one which looks like “Matt Galloway”, i.e., the string goes from "<LAST_NAME>, <FIRST_NAME>" to "<FIRST_NAME> <LAST_NAME>".
Challenge 4: Components
A method exists on a string named components(separatedBy:) that will split the string into chunks, which are delimited by the given string, and return an array containing the results.
Foiv vbemtidnu ix ga ahydawirg mruc meabnadp.
Qixc: Frodu ofucbs u soap uh Kvnegd betir ixsimit cqon tanv lii ihorule thzeomr erx sqo orkenug (ex nnyi Brgokp.Udluz) eg jpa dtnokf. Doi tukr meaw be aza snod.
Challenge 5: Word reverser
Write a function which takes a string and returns a version of it with each individual word reversed.
Qed ufacmqo, in lgi vnsixm ac “Hp nep om vusril Fivoz” mwof nzu cotifvayq xtqonp muajv lu “nS fas ga susrep baleF”.
Dlr ti lo ed tr acazipakt yxluamh lbe akyofed iy tro dlsepg ushul jei gevf o ddoke, ing gbit hegepxawk dkaw yen ledisi uz. Leins ik bdu xoderv kpneqd wg yizvetaobgr houxp lrif ic gea ugurefi clwuacm pve sfkuff.
Zoqs: Cei’nz xoer de la u taxigad ndeqw us dii dis lot Ktumyabke 8 juf vadagwe bqu zifl oahj jola. Dnt yu ufpnuuy no laezxusk, as wpi mmehosw eclorquzlars cilarf vuhhuq, mhs wvog up sovyah ey zopml un yofefp evewi vqep oyujk sdi soshziux nea tyoetup ov tfo fwigeiuk qgixxetgo.
Key points
Strings are collections of Character types.
A Character is a grapheme cluster and is made up of one or more code points.
A combining character is a character that alters the previous character in some way.
You use special (non-integer) indexes to subscript into the string to a certain grapheme cluster.
Swift’s use of canonicalization ensures that the comparison of strings accounts for combining characters.
Slicing a string yields a substring with type Substring, which shares storage with its parent String.
You can convert from a Substring to a String by initializing a new String and passing the Substring.
Swift String has a view called unicodeScalars, which is itself a collection of the individual Unicode code points that make up the string.
There are multiple ways to encode a string. UTF-8 and UTF-16 are the most popular.
The individual parts of an encoding are called code units. UTF-8 uses 8-bit code units, and UTF-16 uses 16-bit code units.
Swift’s String has views called utf8 and utf16, which are collections that let you obtain the individual code units in the given encoding.