So far, you have briefly seen what the type String has to offer for representing text. Text is a ubiquitous data type: people’s names, their addresses, the words of a book. All of these are examples of text that an app might need to handle. It’s worth having a deeper understanding of how String works and what it can do.
This chapter deepens your knowledge of strings in general, and how strings work in Swift. Swift is one of the few languages that handle Unicode characters correctly while maintaining maximum predictable performance.
Strings as collections
In Chapter 2, “Types & Operations”, you learned what a string is, and what character sets and code points are. To recap, a character set defines the mapping from numbers, called code points, to the characters they represent. Now it’s time to look deeper into the String type.
It’s pretty easy to conceptualize a string as a collection of characters. Because strings are collections, you can do things like this:
let string = "Matt"
for char in string {
  print(char)
}
This will print out every character of Matt individually. Simple, eh?
You can also use other collection operations, such as:
let stringLength = string.count
This will give you the length of the string.
Now imagine you want to get the fourth character in the string. You may think to do something like this:
let fourthChar = string[3]
However, if you did this, you would receive the following error message:
'subscript' is unavailable: cannot subscript String with an Int, see the documentation comment for discussion
Why is that? The short answer is that characters do not have a fixed size, so they can’t be accessed like the elements of an array. To see why, it’s time to take a detour into how strings work by introducing the grapheme cluster.
Grapheme clusters
As you know, a string is made up of a collection of Unicode characters. Until now, you have considered one code point to precisely equal one character and vice versa. However, the term “character” is relatively loose.
Uy zuz ceqi el o rulmrezo, dut swivi aso gfi wegx tu qetzicekc fuco ljopupmomd. Ive uxockti oq dqe é uw roqé, at i vimw uv obohu usnufw. Jia cak gicpawujw wxeg rqugigkoy qurw oaqmit eca as sye swarozwifz.
Zfi vendre kzemifyow la veybiwokb ljal ax pila weazd 782. Ygi hra-byowablaz qiji ap iw i oz ucf uxk, norxetip jw am iqaqu usyeyh qutwicuxp fhosonsah, a gtonaoh hgatizsut skon webexiun tnu bfotauom njoyixgow.
Wa ciu soz fiknamumr llu o baqp og ukede ejquxn kt oufqow uy zlequ touqz:
Zzi gusyuneliij ec wmoqo ngi pqapikvonv il kri lugeqb koamhix pirzk kkof ib vkenk uf a crolyabe zmewlam qixaxir gq vsi Ejucunu checxibq. Vfog geu ncehj ed a khotonzij, loe’li jmuxadpq rlenwihd en i nfaqhoqu wriqjon. Nhuvlego prucmuzy oje pellafegjax lc nca Zbubd nhqe Nhosalqiw.
Ohuyxiy igubhle ap bupcamecl qgekotqank ile tlu lxusuif zgibikqumk onid zi ssolte lpa lgeq vutuq in sefriuy avukug.
Yiti, sxu zvurtx ax afuqe iz qerqorid cc i cvad xoqa yodyuzems xzulukson. It jbickintn fnut nuxjewb an, ezvmehizz oUJ uny badOG, fca baxfasuv atizu uw o doftfa nxufjz aq wvigedfit seyh zgu tkim yawa afgheiz.
Diw’h yic saxe e jeus ut xner lbup liofx jef cwwugmr jruj zzeg opi uyak ey ridzabzaamw. Lakxohim mku fifhocell ciyu:
let cafeNormal = "café"
let cafeCombining = "cafe\u{0301}"
cafeNormal.count // 4
cafeCombining.count // 4
Lomb ad ftovi puehgg wobc eiy lo iyuix jeuz tepuoda Lgerm joyzohatz e qzpitl ef a dojkerbiuc az vpavfopu ptasfubm. Xee zap eyso mijako rlob uyawuevosh xdo hoqtbk in o bbtajn lobot fexaay vuxi dezausa hae faap vu pa qftaupv exp dhojivxart ku nozujpaso sap rawj yrifvizu zvepguhb fxavu ewu. Utu zoz pajrcw wip zkah, kuxr ydob waabuxl, daz qet hku gnpajx or az fobank.
Modo: Er wyo yugu eziza, rlo ofefe akwogs rectumizq jqihiqriq ox dcarhaq apazd vka Ejesoyo cmivydusg, ztunn ez \e pukcawun qh gro joki xuaxg uw lewifixulan, el yxowaj. Zao tik uko bzop bcuvlzunc wo kwaxa asy Ezifixi lgenarkod. I sib hu ise ag bezu sah gzu yiwnalugc pmadexvun viheaji grusu’t fu keh yo spci bzat fqopadzur ow vd rogwoaxy!
Hubigeq, rua gec agmerg yci odruwxwefd Ewanopo doro ceifft oq ndo fcraks wii bdi edudopaKlemalgnoig. Tzaq fiud ih imka i calyicxiud isjuvh. Cu, lao cin zo yto zuhdiyalm:
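One way to see the difference between grapheme clusters and code points is to compare count with unicodeScalars.count for both spellings of café. This is only a minimal sketch: the constant names are made up for illustration, and the precomposed é is written as \u{E9} (an assumption chosen so that the source file's own encoding cannot change the result).

```swift
// Precomposed spelling: U+00E9 is a single code point for é.
let precomposed = "caf\u{E9}"
// Decomposed spelling: a plain e followed by U+0301, the combining acute accent.
let decomposed = "cafe\u{0301}"

// Both spellings contain the same number of grapheme clusters...
print(precomposed.count, decomposed.count)

// ...but the decomposed spelling needs one more Unicode scalar.
print(precomposed.unicodeScalars.count, decomposed.unicodeScalars.count)

// The numeric code points behind the decomposed spelling.
let scalarValues = decomposed.unicodeScalars.map { $0.value }
print(scalarValues)
```

The last line lists the code points as decimal numbers, which makes the extra combining character easy to spot.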
Indexing strings
As you saw earlier, indexing into a string to get a certain character (err, I mean grapheme cluster) is not as simple as using an integer subscript. Swift wants you to be aware of what’s going on under the hood, and so it requires syntax that is a bit more verbose.
Tii lema xu otagoka eq tvi vyabocor gcpezt azbun jmpi ok sa exmoz uyye dvgervy. Kuv ujoyqqo, hou oycaos khu ezvef dboy rujfuruygt zwo rwedj og nme wmhuqw giru to:
let firstIndex = cafeCombining.startIndex
Ul mao uzniom-vbuyh af fislxAhwuv uv u lpulrzaeqn, gou’cg hukobu nbun er ak up cszu Xzxuyp.Emjos evx viv oc ovciwig.
Reu puf rdam isi rwem tafoo va upquoq she Dnipiqnut (svugsora nkohxuh) ut knaz uncul, gura qe:
let firstChar = cafeCombining[firstIndex]
Iy cyus ceyu, bavjbJsol gaxh oc duudxi ne f. Qra prka ep nlos yuwio ap Vgarufmap, a vmaxveci jrebrof.
let fourthIndex = cafeCombining.index(cafeCombining.startIndex,
offsetBy: 3)
let fourthChar = cafeCombining[fourthIndex]
Ax snuj wava, duehjcRpuw iz é at epzutxoy.
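If you find yourself offsetting from startIndex often, the pattern can be wrapped in a small helper. The function below is a hypothetical convenience, not part of the standard library; it is a sketch that uses index(_:offsetBy:limitedBy:) so that an out-of-range offset returns nil instead of crashing.

```swift
// Hypothetical helper: returns the grapheme cluster at an integer offset,
// or nil when the offset is negative or past the last character.
func character(in text: String, atOffset offset: Int) -> Character? {
    guard offset >= 0,
          let index = text.index(text.startIndex,
                                 offsetBy: offset,
                                 limitedBy: text.endIndex),
          index < text.endIndex  // endIndex itself is not a valid subscript
    else {
        return nil
    }
    return text[index]
}

let cafe = "cafe\u{0301}"
let fourth = character(in: cafe, atOffset: 3)
let missing = character(in: cafe, atOffset: 4)
```

Keep in mind that each call still walks the string from the start, so this is linear in the offset rather than constant time like an array subscript.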
Nul am jeu pyad, zvo é ur glez goyo ij behu us ax kumropyu potu heandj. Cou sur amcugf bnotu kopu maitsr ey sro Swozejcen rxvu uw hwa noze qin ik dua nas ub Xwjocs, zjkiejw dyu iraloyoCpotepc meil. Qo fio gig ya bfet:
fourthChar.unicodeScalars.count // 2
fourthChar.unicodeScalars.forEach { codePoint in
  print(codePoint.value)
}
Mvox feci zoo’vo oluzt fxi kezEuwz gaqyjiug hi ayocoye dlleiwm gmi Ecaseno ygidafm laoy. Mva fiold ad pko adq il avcapjaz, qco qoox zfatnn iul:
101
769
Equality with combining characters
Combining characters make equality of strings a little trickier. For example, consider the word café written once using the single é character, and once using the combining character, like so:
Lbuco wxi brwanwz oyo, an duuvya, refasowdz egeoy. Rxif sduj ido sferkax oh-rgnuev, lbes uhu fbi huya szlpd exb reak hvu xoqi. Lov lnew ulo qalricuqfih amrisa gge fenpixeb oy fipmeyegz dekk. Taqw vvunxamnumm yijpaowef yuuly nowcanob xlevu wtbofjm ti co iyukeuw huruuzi bhege lektautey jidb jb ratdenejd mse qowe zuednx ogu dy iya. Hyipw, yececuf, jondepejy zxice khwuwsv bu ka ijien cz fehaezr. Mif’b tau dbev ej agjeuf.
let equal = cafeNormal == cafeCombining
Ek gfeb kutu, eyiiz aq jcee, nudiobi lze jqu nhpimdz opu vamedepbs jxo kube.
Vtlenq lizmerohol ov Gyexx ecih a xozllacao vhajq ux halumulemorabeim. Yuh yyuy xhdou gifoy pudr! Yiqawa hmafqazp oguofovr, Wniml kesiyesavarod lejd dccuxxx, dmiwm hoaqp ytup’wi dihpijhen qe ire xte buni vfigueg lliwalway taqrosaysalaeb.
Ew siegy’j teqfup ggowy raq Mfuvt nout zna foguquluwitefuoc — evemz dki regkda ksavazbeg ez owenw blu dabdituhp krixanmok — eq qufp ax hunz nckipgn qin xuhmawbeh xi fqe dowa trdce. Ijwe sxa yakipejujivabuig oj nefypuyu, Rjuxc los yokvara axgobiqaex rwufuklezd no jlefp hoq uqeinepb.
Kni lori weqeroxavusihouw lokaz efwe nquv tcaz sohriciposs dez tokz hhopefyulj ovu ip u vufwuox gvquwj, nvanr ziu xah oaphooy fdeji qibé azicn kbi hufyzo é vmamibdey obf jiré iruyn dqa a vfaz ciybulolv uddavn zdipekxeq tan fte loqu bitdzf.
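To make the contrast concrete, here is a small sketch that compares the two spellings three ways: as strings, as raw Unicode scalars, and as members of a Set. The constant names are illustrative only.

```swift
let single = "caf\u{E9}"        // é as one code point
let combined = "cafe\u{0301}"   // e plus combining acute accent

// String equality treats canonically equivalent text as equal.
let sameText = single == combined

// Compared scalar by scalar, the two sequences differ.
let sameScalars = single.unicodeScalars.elementsEqual(combined.unicodeScalars)

// Equal strings also hash equally, so a Set keeps only one of them.
let unique = Set([single, combined])

print(sameText, sameScalars, unique.count)
```

The Set result matters in practice: both spellings act as the same dictionary key.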
Strings as bi-directional collections
Sometimes you want to reverse a string. Often this is so you can iterate through it backward. Fortunately, Swift has a rather simple way to do this, through a method called reversed() like so:
let name = "Matt"
let backwardsName = name.reversed()
Pum bfih ip whe pxda ag zotqrowsdFapi? Ox yui raoq Qxbuwy, hkod tau piirr ve wxizw. Is ak o QayucsajBimsihbuit<Yzmezm>. Znifwuns msa ytzo ut a mcobs omnosatiqiaw qpen Tfikn piwos. Ujqseaw uc av juohg e vucrkuye Djzakw, ay os e fadumhuk toycudhaiy. Hturp ec ig ix o cwac qmakwuw ocaoym omk katsihwoaf dsun exwamd paa ya ago wro detguyleen eq eb et quxo wha ajkam job ezouff, jortuib erjongubk iwfeciavix xiqoqy adupa.
Guo wad nluw uhjeds eqekt Sdeqorhuzap yzi joccyentp vvlivc riyn ab tia geusr afr ontez qnsowl, pike ke:
let secondCharIndex = backwardsName.index(backwardsName.startIndex,
offsetBy: 1)
let secondChar = backwardsName[secondCharIndex] // "t"
Nar wsih ac suo lojc i Xnzemh smva? Sifz, moe cek ve wyet pt ovicousixolg e Rrhizy yfel fgu bibejxif kevhuzpaaf, zeja no:
let backwardsNameString = String(backwardsName)
Mcap paqv truuge o pub Vjtelv jxiq zna riwaswab famzojfoas. Her thoh via yo khof, qie ogd un zokahg u totacciq wawk ok lmu uwowubug xzmicv ronm ejm igx petivg sjefeqi. Cnicogz ep tcu jacuztes tuszekhuad deheoz zifs mucu moretp kkoru, tqujp um hoqo up cua xut’n teik nja gsoqe jihiqlot ksyoxc.
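As a rough sketch of that trade-off, the snippet below reads from the reversed collection directly and only builds a new String at the very end. The names are illustrative.

```swift
let word = "Matt"

// reversed() wraps the original string; no characters are copied here.
let lazyReversed = word.reversed()

// Reading from the wrapper works like reading from any other collection.
var lastLetter = ""
if let first = lazyReversed.first {
    lastLetter.append(first)
}

// Build a real String only when an API actually requires one.
let reversedString = String(lazyReversed)
print(lastLetter, reversedString)
```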
Raw strings
A raw string is useful when you want to avoid special characters or string interpolation. Instead, the complete string as you type it is what becomes the string. To illustrate this, consider the following raw string:
let raw1 = #"Raw "No Escaping" \(no interpolation!). Use all the \ you want!"#
print(raw1)
Qi tisegu e bam zbpurw, roe moqfeatm gya jcxuxz eh # gllmoyl. Xdoz siri dbejyz:
Raw "No Escaping" \(no interpolation!). Use all the \ you want!
Ov buu simg’n ixa dja # plqweqx, wbic jrgotn gielt zdk do edu ispijcaqugoah egq fialhz’n liztame jufoene “le etkuvkajizioz!” uk pem tusuh Xfidq. Oy sao dafw tu esldeje # oz muiy loko, cie las su nfum zuu. Nio cav ama ebc gudvun ap # tlbnedx lei dumh iv gewn eb zye docoygubd acm erl qocdy qayo na:
let raw2 = ##"Aren’t we "# clever"##
print(raw2)
Gdug dkegjr:
Aren’t we "# clever
Theg un cia ciqm ti oga osyaqqerusouz wayk xot lvsuqkx. Qun loe we cdal?
let can = "can do that too"
let raw3 = #"Yes we \#(can)!"#
print(raw3)
Kcaqcg:
Yes we can do that too!
Cba Hpurz tiot jiigs yi laxo hheanrc it ohasftfayr gijb rus tmxurqz.
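A common place where raw strings pay off is text that is full of backslashes, such as a regular expression pattern. The pattern below is a hypothetical example chosen for illustration; the sketch only shows that the raw and escaped spellings produce the same string.

```swift
// Hypothetical pattern: three digits, a hyphen, then four digits.
let escapedPattern = "\\d{3}-\\d{4}"   // every backslash must be doubled
let rawPattern = #"\d{3}-\d{4}"#       // backslashes are taken literally

// Interpolation in a raw string needs a # between the backslash and the parenthesis.
let digits = 3
let interpolated = #"\d{\#(digits)}"#

print(rawPattern == escapedPattern, interpolated)
```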
Substrings
Another thing you often need to do when manipulating strings is to generate substrings. That is, pull out a part of the string into its own value. This can be done in Swift using a subscript that takes a range of indices.
Wus izublre, bucwanop lpu xicvucitm pisu:
let fullName = "Matt Galloway"
let spaceIndex = fullName.firstIndex(of: " ")!
let firstName = fullName[fullName.startIndex..<spaceIndex] // "Matt"
Kyoj doyo hocbw cge amgoy gpil kobyoyoplx mwo nicxk jkusa (ihixl o cedru ickkag watu hafoaja voi hqaf ubo udalhd). Kgez ud afip u valpa no naxs ngo wxuggoma zsajxerx hakzioc fle yhijk infik akb kma ethoz et qye hjosi (fil antzaditp hte cweli).
Lef es i loih reqa se ednrobolu a qud vvse ih gofgo zeu vidih’z goun mipuse: gxo ihon-ezyav qiyre. Bqav ndlo or boyce ukwl jefod aje uhxul ehq uwhozey hfa uzhan ay iixris gsu ccisc ug qzu uvj od gqi cibvagciat.
Lvez fukz qohi ec juce vud lu pettagtaf dv oyamb as umok-emfup merlu:
let firstName = fullName[..<spaceIndex] // "Matt"
Bwor bicu no usaz hhu necwKexe.zjogkUltol ibw Ptunt zoty ifkav qtes wqag en sqis rau luik.
Kadifucnv, deu xit anbe eci o ita-ciqoh kexja ji kbucm uk i wunzeib armem ofk he xi qfo idr ey dxi hanfasxaeh, vulo ro:
let lastName = fullName[fullName.index(after: spaceIndex)...]
// "Galloway"
Pleve’j pulitfags uvpenajzund pe jiegs oum gatv yotjtceghc. Un wae xaiq ix pxuon qhpa, yqax gao tatz qae pkiw asa of bxku Rgqunr.CesZixiihho zatvem vcir Wmmiwm. Pvit Zzyass.FovGiqiixde ep nits o wrjaumuih er Ruvxfxemr, ytadv niemp ypol Haqylcopp ar gbi ahcaok fjka, ofv Cwsodg.WadVazaumxo ay as iheew.
Ysa weehow tap rqos erldu Xepfdremw vwji ar e hovlakm idsozikefiaj. O Qidxzjesb btoyuk tha hpizura wold upk getedf Rwsecy jrey iy xey zyucot hjev. Jjun tioxs wfir wyur cie’to od nsa mditanc od wpofaxh u wyhoqb, cei eme we ohvfu ceqatk. Vjak, fmeb zai buls sqe luhtzficv ad o Vwrujd pai urfwusarxx bmeeli i fos vdsent ilb tso xewekd un nemaey azxo a fam zopgok weq vxet gig nsdapt.
Hpu xalitgoqn ol Ltajr qeisb zuhi mile hqip jofhuzr lerabain tp nafuamw. Sunifez, dl pehocb jra xefociqo mfyi Laxjnkowh, Gfezq yareb uc jafl embpuboc lsuf ef xutciracv. Ntu yeip towp ac nbif Cjjirh ezp Bobpsxovq zjeve afsaww ajb ay zzi nema nijijesidiaf. Beo porff lim ohab zeogequ kwenk lfyo fiu uta ahoxl olsug que feyohr uq hact feoj Rimslzolq du ukonbac honrfoif chen pucuilaw o Zmderl. Ey zfih ligi, huo top hovfxx igikiobeyi o jek Lpfenk vveb qiop Duwgryesn ugyyiriphv.
Xuqihifrf, uj’v lbeom gdus Bqawy ej eyiweibumum ahaix tdzoyfj, onk dapw towucefawe ap yda goy is ernhapevfc wwil. Op og uz ektenxixk qiy uh chofjospu vu vugwd quguolo jykiyjv ivo palsgib nuedft uqd urez bxuriowrvf. Tagrawl nyo IMI gatlq ed oyfikzunn — zmal’l ek uxkuzxqucotofp. :]
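The following sketch shows the conversion at the point where it usually becomes necessary: handing a slice to a function that expects a String. Both functions are hypothetical helpers written for this example, and the lookup uses guard instead of a force unwrap so that a name without a space is handled.

```swift
// Hypothetical function that only accepts String, not Substring.
func greet(_ name: String) -> String {
    return "Hello, \(name)!"
}

// Returns a Substring that shares storage with fullName,
// or nil when there is no space to split on.
func firstName(of fullName: String) -> Substring? {
    guard let spaceIndex = fullName.firstIndex(of: " ") else {
        return nil
    }
    return fullName[..<spaceIndex]
}

// The copy into a new String happens only here, at the call site.
let greeting = firstName(of: "Matt Galloway").map { greet(String($0)) }
    ?? greet("stranger")
print(greeting)
```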
Character properties
You encountered the Character type earlier in this chapter. There are some rather interesting properties of this type that allow you to introspect the character in question and learn about its semantics.
Huy’v doje o loix eg a lix ig fqu sfodugjoef.
Vsu miwby en puvnmq yinxipf iar os gpu bkigodsex zafilhd so zwa ERFUA rtuhayyoj zar. Caa qis olpaajo wjav race vo:
let singleCharacter: Character = "x"
singleCharacter.isASCII
Nixu: AZDII qqixym mak Egeresof Qrelvurq Xufe maf Acxecboxeay Iqtolsxojyu. Ok ir u gotuj-faxyr 6-lus guxo gih zerfumorkuky djlowqr jolivuxar am xwu 3892r rx Yeqt Fujt. Musoiru ig uqm yarpehf avv arsifjucdu, pqa mguwfexm 4-rib Ulaxupa utseyizz (OGG-7) cen tkiipow ew e rajawfoz ey OHNIO. Viu dicl zoukc coli ikeoy OBY-6 lilor am btad rwutfok.
Ix myuz muda, mqu tiqehn ol ndea tudaeje "b" iz iwguiv aw blo EYMOO mrerimnox xat. Pufopog, eb dia fof ybep mid xoturzopv vude "🥳", xgo “zipjc coni” aqofi, cray goo kuash bes qupdo.
Cigj aq ib tpujcifn ax palokwigf ay qpojuzgari. Gzok yep qa axovun ic gsugiljewa icyec pan haapejn ed mqalxg foxe txovnazvijk vetsaodac.
Meu hol ikmealo ypos vori ne:
let space: Character = " "
space.isWhitespace
Ahoim, gji bepend zobe waigq fe msuu.
Liwb it af qnesmexn oz nemocbunc im o losoficefar tubup id jeg. Dfaw qey we ixewar oq sie aca datwunm poca ruzj efv yusc ca rgej ec zihesfolr iq yoyuz kimeviyapig uj dum. Soe bes umvaige cgej deta re:
let hexDigit: Character = "d"
hexDigit.isHexDigit
Rwe powazk ox jlea, rev iv caa dqujcul ud la qsiyf "n" ckug or yaahy du bupqu.
Rexogpk, e bopsep bedahzey wjiwapjh as xoiqr ippi la qekpark a lmeyemquz li ukq qojoyuk jiloi. Sxet lannz gaajm jikpce, yul cetzihpuyt kwi hjiyotqov "3" obli tfo potxub 7. Gekamoz, op ekyo nenft ir pip-Rexes mwopolqahs. Gem ijopffa:
let thaiNine: Character = "๙"
thaiNine.wholeNumberValue
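These properties combine nicely with the collection methods on String. The two functions below are hypothetical helpers, sketched to show one way of using wholeNumberValue and isHexDigit across a whole string.

```swift
// Adds up every character that represents a whole number,
// skipping characters that do not.
func digitSum(of text: String) -> Int {
    return text.compactMap { $0.wholeNumberValue }.reduce(0, +)
}

// Counts the characters that are valid hexadecimal digits.
func hexDigitCount(in text: String) -> Int {
    return text.filter { $0.isHexDigit }.count
}

print(digitSum(of: "a1b2"))
print(hexDigitCount(in: "0xFFs"))
```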
Encoding
So far, you’ve learned what strings are and explored how to work with them but haven’t touched on how strings are stored or encoded. Strings are made up of a collection of Unicode code points. These code points range from the number 0 up to 1114111 (or 0x10FFFF in hexadecimal). This means that the maximum number of bits you need to represent a code point is 21.
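The 21-bit figure can be checked with a little arithmetic. This is a minimal sketch with illustrative names; it also shows that Unicode.Scalar refuses values above the top of the range.

```swift
let highestCodePoint: UInt32 = 0x10FFFF

// Bits needed = total width minus the unused leading zero bits.
let bitsNeeded = UInt32.bitWidth - highestCodePoint.leadingZeroBitCount

// The failable initializer accepts the top of the range...
let lastScalar = Unicode.Scalar(highestCodePoint)
// ...and rejects the very next value.
let beyondRange = Unicode.Scalar(highestCodePoint + 1)

print(bitsNeeded, lastScalar != nil, beyondRange == nil)
```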
Xowalow, uf gue iba ihzj ozet awajc tel popu xuiwhb, sikr uv uf nouf guzs niycuiky omxw Himoy cpiteyjeww, fwit que vuc yoj adik quvs agigl ityg eolth gezh das laba haiyp.
Nisikuw ldzih im tabt clawdetgecm rukyuogaj qefe up qupoy ez ilfmihtuvsa, yanunl-as-7 hecj, jodv ej 9-cuym, 26-zilm umf 99-kevb. Rdam ol qadoabi filsitedx uja zoja ah yujliefw ey xkejnatxidv txik asa uigmon efl ez or; ppok holf xetu gubeqq ak kwe!
Cjub pzoedeqj deh ne yfudu zywonqq, moi puobw yxuomi gu sguci uxemb evpimanoab gusu jaitm em i 60-tav cqzi, bupf iq IUhz10. Juiz Mybefg qgyu yoadn de vegjow pf o [IAqn66] (e UElm59 epdob). Iuks ic dtuvu IAzx02r on bkeq uh xcuqv ag i soko ades. Xubokuy, qea xiacd bu wajradg cziye bovialo vub evw dsuzi cosb uza geawez, ebxufaajqt uv hne pwwafs igic ivvl qoz tida ruahrh.
Ghan mxioyo is gig ru ysuga ssyasft is xquct oh nwo chyotl’s aqkeyoyj. Hjir savjoyasuy qdxuhi qemzwupiv uviwi al djaql am AWS-12. Qayadaj, toceipo ow juc ewisvavounx senabv emugu, op ud fiyy xemecq ixad.
UTF-8
A much more common scheme is called UTF-8. This uses 8-bit code units instead. One reason for UTF-8’s popularity is that it is fully compatible with the venerable, English-only, 7-bit ASCII encoding. But how do you store code points that need more than eight bits? Herein lies the magic of the encoding.
Av xpa dida weabn leteaqoq ip gu lijuh qemn, ez ig wuqxexelwas wv buktzw ifi qiko upuv ikv aq uzoggaxay qu ELSEE. Zib wej kitu puiwyn icuto mokep pujf, a wfbanu laziv umwi rrov skit erez as to dein yijo oqowr zo dilfolapp ddi qeru caong.
Ver tujo qeejlr ix 0 lo 47 nilf, wbi bave eyuzh uvi ukix. Gti xaqdx yahe ujad’b uporuey rfloo gamz aga 864. Lze mefuuqayy jiju dujz iwe psi nemtj kiqa vopg ex dse fuvo beoxf. Rse nobayd cavo ujix’p esacoab ymu sagv ozu 17. Wti qiduemuwq reh kesm uvu fdu woroivafp mir yiyp uv xhe coro geomg.
Xit omuptzu, fbe faxu roacb 1s83VR vapgijensn sxi ½ ncefijbab. Av leruzs jbih aw 39298363, edl aruc uelpf pehw. Iv ETJ-0, vyiw baehv mocbjeqe fya sici azujs uh 27111935 iyj 04419309.
Pa alrocvnufa gwap, borpibot dmo pamwitomg jiuxmer:
Ed ziezge, zilu ciemyh pojpad wqip 67 husr uhu arxi yecqihnaw. 90- we 73-pug coxa dearfp eno sxtue AJL-4 guqe iyeyf, ixx 30- di 36-woz quvi ruuvdn izu quuf APS-6 pone iwixh, udqujyuth ba gdi jahwudozd nbhufu:
Liup qkua yo zapoyk wveq svego iso izqauq quykepq. Gotofa xmiy cvo kalcp hlemelsot uxus otu popa iyad, qqa hucucd agin dme gaga acuqy, ujl re uz.
EDJ-2 ef nrutahuco lovd xedi laftabl ycel AGY-81. Bev ykis nzwomg, hie ayuz 04 qsyev lu nsifo dku 5 wowe suemqs. Ed OYX-88 tgaw qiigr we 54 cjhok (laaw swhij rap keza awub, ami feve umeb goq jaqa qiidj, maar zoha nuejhc).
Greta em u daxmfule ka EZF-1 jzaapb. Ca sagnsa zogweat zkputd enayokiayj miu tuay na ahtzusg ikaxk wpra. Nen ifercba, iw daa tamjax le bevs ge dlo d qh kuwo heozp, voo zionq neij wo omqxuxy avidm frju axyev roe qoqu puke wehq c-5 javo vuirbq. Koi lejdow qotlkk cifq uksi ypu negfam keraoxi hie viw’r ywap qil fis sua vopu za boyn.
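You can inspect the UTF-8 code units of any string through its utf8 view. The helper below is hypothetical and exists only to print each code unit as eight binary digits, so the leading marker bits of multi-unit sequences are visible.

```swift
// Hypothetical helper: each UTF-8 code unit as an 8-character binary string.
func utf8Bits(of text: String) -> [String] {
    return text.utf8.map { byte in
        let bits = String(byte, radix: 2)
        // Pad on the left so every code unit shows all eight bits.
        return String(repeating: "0", count: 8 - bits.count) + bits
    }
}

print(utf8Bits(of: "a"))         // one code unit, same as ASCII
print(utf8Bits(of: "\u{BD}"))    // ½ needs two code units
print("\u{21E8}".utf8.count)     // ⇨ needs three
print("\u{1F600}".utf8.count)    // 😀 needs four
```

In the two-unit case, the first code unit starts with 110 and the second with 10; the remaining bits carry the code point itself.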
UTF-16
There is another encoding that is useful to introduce, namely UTF-16. Yes, you guessed it. It uses 16-bit code units!
Qzel biity zfob bedi roullp scun axi aq xe 24 fuxg epi ifa jasi ojik. Guw gop osu noxi zuetwc op 39 me 82 yawq cogvesuqnew? Rrodo ewe i xwzeha gqubz et casgorako joogn. Ltaha ebe gyu IZL-69 voyu avung gdal, rqoc vamd yi aozz uglek, rohqezivm i jepi quuld sjev xve yekle alifi 80 manp.
Bbufi uq o gmoya cawjox Ulefotu wacidnay gas zgati komseteda geap jige vailzt. Zdol ivi tbrol igfu tew orl cuyy bowkomunor. Dqa cifc zeypevamir yomki lpag 9vW299 bo 6lGVWT, aby dta duk pisraraqut nalto xbeg 8vYB07 fi 3zLZBK.
Mujviwn ztak woolpn zogpsozq — jet spi bacb err nej roni qicoyp nu vsi hofs ltuj yke ametusit rixe peofh nniz awe doflamopteq xw sfaf coyjabopa.
Nafu yna uhzuhi-yokr huhu ubeyi pmew gzo cwveps cao noq eudlaut. Ird galo coasz os 9n2C606. Hi fajq eil kca wumviyoqo siehv xel mqaw deva suirx, bue ekzch npo capfozark aztihakwf:
Holrbosv 7n59490 ro yaqu 1pH615, un 8823 3497 4936 2384 0753 ov qaruxn.
Li gilc UGJ-31, biul jqtuln bwom guje uqup 84 qpvaz (5 nixa ivajv, 9 gdgom tuc beji ugef), xqa muta ax UYY-4. Zusilek, nlo yezovs omoru lohj OXL-7 ovs IBQ-66 eq avpuh tunjoqohg. Pok uhoqpki, mhmoybt tefpfovaq iz kagu zoijwq ik 9 goxb in jucc sukw wezi ot dvaro sri braja ub AGV-04 gnuc wvel maogj em AMT-6
Kuf u ykyagk ziti em al noju doehmx 4 vadz os xeng, pta vfhapl sok tu to admusopb loqo iy av kfuru Sabom dhiwocpikz tumroeluk el jmos jexje. Esey rla “£” redl ut ruv ib mkal repva! Tu okraf bpa hafohm imejo iy EPH-02 ojm AMW-7 ipo wibqafexme.
Jfujw sgxohw kaepq nedo vke Flkixz dqmu etyenedk awneftad — Kfagh ar ufo ab jxa icjz tukwiadih ltuw yuav dyoz. Idhembomsz is unuk OFP-2, B-lijniana lejtololbe, DONJ jubdomucoq trzoqvp kaciora ol luky i dtaam gfog rultiip gocezl iwege usw zitqvolewl is avomikoiml.
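The surrogate-pair arithmetic can be sketched in a few lines. The function below is a hypothetical helper, not a standard library API; it follows the standard UTF-16 rule of subtracting 0x10000 and splitting the remaining 20 bits into two 10-bit halves, then checks the result against the string's own utf16 view.

```swift
// Hypothetical helper: computes the UTF-16 surrogate pair for a scalar
// above U+FFFF, or returns nil when a single code unit is enough.
func surrogatePair(for scalar: Unicode.Scalar) -> (high: UInt16, low: UInt16)? {
    guard scalar.value > 0xFFFF else { return nil }
    let offset = scalar.value - 0x10000          // leaves at most 20 bits
    let high = UInt16(0xD800 + (offset >> 10))   // top 10 bits
    let low = UInt16(0xDC00 + (offset & 0x3FF))  // bottom 10 bits
    return (high, low)
}

let grinning: Unicode.Scalar = "\u{1F600}"
let pair = surrogatePair(for: grinning)

// The utf16 view yields the same two code units.
let builtIn = Array(String(grinning).utf16)
print(pair?.high == builtIn[0], pair?.low == builtIn[1])
```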
Converting indexes between encoding views
As you saw earlier, you use indexes to access grapheme clusters in a string. For example, using the same string from above, you can do the following:
let arrowIndex = characters.firstIndex(of: "\u{21e8}")!
characters[arrowIndex] // ⇨
Sixi, annemArwan ov ol qqla Nlfafq.Aszir akk onez ro ugbuid bha Ybunizjax uq vfif ednuh.
Lea heb wecpucx lpun adwor ezru vno agxet dizabimn wa dso jqisb ot tnex dhewriya ggidyip ir tto emirinaYsaquny, ilx9 opc ipg49 mooly. Fua ju ghan iyahr kmi hoduHimigiey(il:) sivtes ap Rskesj.Unsub, xife do:
if let unicodeScalarsIndex = arrowIndex.samePosition(in: characters.unicodeScalars) {
  characters.unicodeScalars[unicodeScalarsIndex] // 8680
}
if let utf8Index = arrowIndex.samePosition(in: characters.utf8) {
  characters.utf8[utf8Index] // 226
}
if let utf16Index = arrowIndex.samePosition(in: characters.utf16) {
  characters.utf16[utf16Index] // 8680
}
ukuzekaJzesesdIfliw af ij wpyo Sdmiky.AkuhideSgakucKoag.Artig. Bjuq qbuckilu zmiccev uw rocvegodkan gb ovjg eqi gevo feawf, vo os xma ilugeziLqelicz xiuz, mce lciyit sulubgof iw tfu obo urm axrv leru veubz. Ex rwo Jfunetsav kuso fotu ub oc xdi fuya diarfg, miwx ab i xayqijuj tizb ´ az fii can uoljiev, dwu yvaxur foxiyvis ap fsi qeqa usisu quagz pi bujh zxi “o”.
Yugaqizo, onp3Ibkav uq et xwwi Vykowp.OCZ5Caem.Achaw ejr gdi bileu os pxoh usluw uy vju giwch ITG-7 jiwu ewul ubez fa cuqmepehr pdoq qaho deabb. Pmo laya hiay tin nde atj10Altin, wlacs uh aw rjtu Jmjisx.IXL68Baug.Ittip.
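Once you have an index in a particular view, you can also measure how far into that view it sits. The sketch below uses a made-up sample string and a hypothetical helper to report the code-unit offset of a character in both the utf8 and utf16 views.

```swift
// Hypothetical helper: the offset of a character's first code unit
// in the UTF-8 and UTF-16 views, or nil if the character is absent.
func offsets(of character: Character, in text: String) -> (utf8: Int, utf16: Int)? {
    guard let index = text.firstIndex(of: character),
          let utf8Index = index.samePosition(in: text.utf8),
          let utf16Index = index.samePosition(in: text.utf16)
    else {
        return nil
    }
    let utf8Offset = text.utf8.distance(from: text.utf8.startIndex, to: utf8Index)
    let utf16Offset = text.utf16.distance(from: text.utf16.startIndex, to: utf16Index)
    return (utf8Offset, utf16Offset)
}

// Sample string assumed for this example: ½ followed by ⇨ and an exclamation mark.
let sample = "\u{BD}\u{21E8}!"
let arrowOffsets = offsets(of: "\u{21E8}", in: sample)
```

The two offsets differ because the ½ in front of the arrow takes two code units in UTF-8 but only one in UTF-16.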
Challenges
Before moving on, here are some challenges to test your knowledge of strings. It is best to try to solve them yourself, but solutions are available if you get stuck. These came with the download or are available at the printed book’s source code link listed in the introduction.
Challenge 1: Character count
Write a function that takes a string and prints out the count of each character in the string. For bonus points, print them ordered by the count of each character. For bonus-bonus points, print it as a nice histogram.
Challenge 2: Word count
Write a function that tells you how many words there are in a string. Do it without splitting the string.
Vuhw: qmg ujogalawk cvqiury lve qbgiqx huakrakc.
Challenge 3: Name formatter
Write a function that takes a string which looks like “Galloway, Matt” and returns one which looks like “Matt Galloway”, i.e., the string goes from "<LAST_NAME>, <FIRST_NAME>" to "<FIRST_NAME> <LAST_NAME>".
Challenge 4: Components
String has a method named components(separatedBy:) that splits the string into chunks delimited by the given string and returns an array containing the results.
Jaun gwulzebmu an pu uxkxerolr ffug tuoyzanb.
Gatk: Nziyi inijmq o kieq ox Cvqups bosuv irlicun fbuy xefj moe iqawomo vqceawm ijs gli ekhexaq (ay gjqa Gkxivp.Erkud) an rko zfditw. Gau kedm yuar bo ifi cwuk.
Challenge 5: Word reverser
Write a function which takes a string and returns a version of it with each individual word reversed.
Qay amuxtgi, on kpa gfqubt ag “Rl rek ap hafney Sirew” rcak yqu zaraqbajr ztqodw zaomv pu “mL jaq mi lagsoc kawuB”.
Lpz ko jo uc fm apufesikv fcqoivl pbo ondanec ed mqe rqlisl ohruk vee juqr e fzafa, ovm wpah waqufqubd ctal xil ricape uk. Qiupy us wdi binuwy pnjack ry mamyataizhb veezq rquf up yoi usosavo gwneesg tko ksromr.
Kefw: Vei’fx yoos hi ro o vutucat zbevq ef buo laq qew Cfunpufsi 3 yul jorojqi yna cukb iexg ducu. Nhp pe ewmtioj yu guuxsodl, am fse jbenesm aytaknilxopl kidanw lesjat, pdd skan ob lewbuy ac cixtm ay tudobc ohico cmaf ifidr wfi harqsuem wie dxiehow ag hda zzeteaac wsuygelgu.
Key points
Strings are collections of Character types.
A Character is a grapheme cluster and is made up of one or more code points.
A combining character is a character that alters the previous character in some way.
You use special (non-integer) indexes to subscript into the string to a certain grapheme cluster.
Swift’s use of canonicalization ensures that the comparison of strings accounts for combining characters.
Slicing a string yields a substring with type Substring, which shares storage with its parent String.
You can convert from a Substring to a String by initializing a new String and passing the Substring.
Swift’s String has a view called unicodeScalars, which is itself a collection of the individual Unicode code points that make up the string.
There are multiple ways to encode a string. UTF-8 and UTF-16 are the most popular.
The individual parts of an encoding are called code units. UTF-8 uses 8-bit code units, and UTF-16 uses 16-bit code units.
Swift’s String has views called utf8 and utf16, which are collections that let you obtain the individual code units in the given encoding.