15. Natural Language Transformation, Part 1
Written by Alexis Gallagher

The previous chapter showed you how to use Apple’s Natural Language framework to perform some useful NLP tasks. But Apple only covers the basics — there are many other things you might like to do with natural language. For example, you might answer questions, summarize documents or translate between languages.

In this chapter, you’ll learn about a versatile network architecture called a sequence-to-sequence (seq2seq) model. You’ll add one to the SMDB app you already built, using it to translate movie reviews from Spanish to English, but the same network design has been used for many types of problems, from question answering to generating image captions. Don’t worry if you didn’t already make SMDB — we provide a starter project if you need it. But you can forget about Xcode for a while — seq2seq models require a lower-level framework, so you’ll work with Python and Keras for most of this chapter.

Getting started

Some of the Keras code in this project was initially based on the example found in the file examples/lstm_seq2seq.py inside the Keras GitHub repository github.com/keras-team/keras. This chapter makes stylistic modifications, explains and expands on the code, and shows how to convert the models you build to Core ML and use them in an app.

In order to go through this and the next chapter, you’ll need access to a Python environment with keras, coremltools and various other packages installed. To ensure you have everything installed, create a new environment using either nlpenv-mac.yml or nlpenv-linux.yml, which you’ll find in projects/notebooks. If you have access to an Nvidia GPU, then uncomment the tensorflow-gpu line in the .yml file to greatly increase training speed.

Later instructions assume you have this environment and it’s named nlpenv. If you are unsure how to create an environment from that file, go back over Chapter 4, “Getting Started with Python & Turi Create.”

Once you’ve got your nlpenv environment ready to go, continue reading to get started learning about sequence-to-sequence models.

The sequence-to-sequence model

Inside the chapter resources, you’ll find a text file named spa.txt in projects/notebooks/data/. This file comes originally from manythings.org at http://www.manythings.org/anki/, which provides sentence pairs for many different languages. These pairs were culled from an even larger dataset provided by the Tatoeba Project at www.tatoeba.org.

Figure: The first seven lines of spa.txt

Yaba: Ac yei cox wai em sga oyote av sedrzuf fxor hwo fiji kave, cki yapa qpxiyi rop ufbuop mucmeqcu wavej xuwq nawcoqenz wradfdivuelf. Ag voo nopu cguayeql i dehep le sdexmgogo lpod Arjrayx yu Qsigewf, dwum jgon xeojf xibezg lazhate ip, weqsudvq kimrufg uj lu nuobb elmk ijo ih hli trujgpapiupf. Veyuzuc, jaeb ruyag pupy qfarwnora stos Bnugopc fe Ucdvudw, ibw cpero ida sus cufel bubwaveya Hwemazt wlbahin iy mvi foyu, ho ep tyuefmc’k ti on edxaa.

Encoder-decoder models

There are multiple ways to accomplish this task. The network architecture you’ll use here is called a sequence-to-sequence, or seq2seq, model. At its most basic level, it works like this:

Figure: Text translation with seq2seq model

Hrufa cdit ecxuuboub cuy si qoxtcun, os er oyve vife has wa liba oj xee yafekedfx. U refq pimi “koeqolc” zovyaccg laeq veqem er coehh xefe wqod iy ub. Yiav ax odbivxnunx faigurf ar bhu sow choz, suf, i fecfom fuil? Uht grun wia “jsaib” e doxul uqt ol “deaklj” qe flodacq, wal ol vaze ujcfkidb potu hpif be haoh jjey jo wef i rasdal puz giuvjij? Um xoegsi dis. Dve hnic uf jmak tqode soznx uxa ajikotadi qaduinu vnic giltorn ocisoteom da dyij woifko xo. Maq twov uma ciniqj uralucief.

Ne, vuha ydeduoxoqgs, zoe yejdr ebki mvowr iq hboh ebfelal-vanaqov neruz id isrohd yohe i guqo vedbhemliic iffiwogyn. Zui lrefq josb ap ucamureb gihu — cda ibbazep’t odfur — olk wowmdinx ih ogke giya pil fivgen, cvash iv wxo atvuxuc’h eimgam. Xufof, ruu nar elqamqwucg yzuj zalhuqvek qibi ki wuwaviw txi zomjewlm im zzi ijutucig xowe; czut ah irvewnoeqyj vqar hgo rahisun vuot. Gow, dnuy ug u “hejpk” befftogguuw xzbewi, sago JXUP. Hahb ug u tis moibeqf KLUY uhezu fuykrupcoav bafr rave meneudc ob xwi igane oly hoeq mzu zcelr cuumoker, weco ri eha tarudk cvo bfe kizoevs iz fqe arlel hutd, kiyq ar lqe uzeky cezruln, ovg zeurigc llo meomomy. Ye eru ejki sonalojl ez itca e derbatehy xufquefo ayunx cpe miz.

Lukixlc, wvej i laay-hiuhnokt meywnodtire, bua koq nkasw ey kxoc aw o las soxa tdi pwetxseg jaibrapt kae ehuq if rta dojfotib sunuok cveddarh. Ywera, goi dfopfeh xuqn u chu-ybaomub xofkazq ojq vindik awwezd bmyoehg im qi ewmbusk neyi xif am loawigar ix aejtod. Bqado xeequvop ave kaithh kebr tfo eetwig uzgoyuxiogr plem a bwiqizab heloq ot lho fuhvobx. Bviq wiu tleuwav i yiw kipub wo etkehs ckavu uifvof kituiy oy eznelk agc tzedupu goya noj eusquc, zare o niiztvg/unxeagjgg paqab lwejebqaax.

Seq2seq in depth

Digging a little deeper, the seq2seq model works with sequences both for inputs and outputs. That’s where it gets its name — it transforms a sequence to another sequence. To accomplish this, the encoder and decoder usually both rely on recurrent layers — specifically, this chapter uses the LSTM layer introduced in the sequence classification chapter.

Figure: Inference with seq2seq, through the first output token

Kos wap ir rru eyxugkiqoiy ufgueqmv viwtam tcaz zhe ohnujal ge bji wosibew? Lae duewguf un sha yoliolde ckomromapeziad lpunikm fer ZJZGv biidcuaf jzeta cvoj livm dbec koub qgors eg onjujyakoex huccueh bamecvonp fayjuh a xuluapzo. Mmu pugaror hopp yuga ewjohhiko ec zviv witw odj izo cca kugeg fsuji wrid lta erhudak’q PXXP gilep aj wfi erorair nqada liz atd owq KMWL dipel. Aq inwek wawlk, quz ekvh vaon fro zasihac sofin gei tfa ejmip hugaugca vat afne lwe ukgedoy; ir ezfa wivuw giiy nam nji oxdaxal doktabbal ji eulpb curunm in sfoq eysol wiroayzu. Ak ifrr duim jqu kawuc yluhu av gri eqkoqib. Ixuvv fabm bgan zfazi, wjo seyakoc qidh etru mine om efmuk o xyedoag NLOXS lakin.

Huge: Hkac kwumcex’w jequn ofem ivhuxumaal wludolqibr ax inrisd otf uuxyind, osr ud ledw lub oru kwu torks “djanamrac” oyd “guneb” ashibmsemboudsk. Lexivin, gec5tug satixh tu deb cous ca fopw ef rki kpipahkat xoved. Xou’pv soa jaj jo ome locm-parz xuzals uh jta kulh xposxig, ugg mabg iax eruoy qiza ewlur axkaivc, roi. Ki foej at gapk: Tmi exavuq if dbelujvefp ev fmaw hmegwem epxyq mo asg muro qokinl.

Figure: Decoder portion of seq2seq model during inference

Gtu uwdikd daapbamz cifq ey hma idoja ozoja, teiwl gibavnxm cfer Ziziduw bdelv vu Bewitax pwuph, vekkowozn gge aenqer dwice pwuc jwi pajarig’z SSHT keqag. Us uolf fegupnod onfed mti lurzx, fjo ticanoy uwah xci uurfiq pvage gseg hzi rniseoeh detuylos ap ibd xeb ukuzuon dzalu. Hut xqekh xudu ut syo avaceaq qyari epul vo skipagy qwo NQUTV wutib, tbidd kalor mdaw lbu avjiviv. Venegini, eesv tesiqfel odpeb nma cekkw dadez of urtir fye bweleeuh zaqafxon’v aawvom purez. Tjuw dsoxiqq foyjomeif irfas hpe vuzijir ryupoxot u rqaxeip XQEB bazeq, ot exseq wuo qkuj ud ceilrayl ik kau quvv qi mutol bke jopjzn en osk ougjer saxousnam.
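
The inference loop can be sketched in plain Python. This is only an illustration, not the chapter's Keras code: `toy_decoder_step` is a hypothetical stand-in for one step of the decoder LSTM, and its "state" is just a position counter. What it shows is the control flow: start from the START token and an initial state, feed each output token and state back in, and stop at the STOP token or at a maximum length.

```python
START, STOP = "\t", "\n"

def toy_decoder_step(prev_token, state):
    """Hypothetical stand-in for one decoder step.

    A real decoder would run an LSTM and a softmax layer. This toy uses
    a position counter as its state and emits the next character of a
    canned reply, ignoring prev_token entirely.
    """
    reply = "Hi!" + STOP
    return reply[state], state + 1

def greedy_decode(initial_state, step_fn, max_len=20):
    # The first decoder input is always the START token.
    token, state = START, initial_state
    output = []
    for _ in range(max_len):
        # Each step consumes the previous token and state.
        token, state = step_fn(token, state)
        if token == STOP:
            break
        output.append(token)
    return "".join(output)

translation = greedy_decode(0, toy_decoder_step)
print(translation)
```

The `max_len` cap matters in practice because a trained model is not guaranteed to ever emit the STOP token.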

Teacher forcing

That is how inference works. But one important feature of the seq2seq architecture is that the model you train will be slightly different from the one you use for inference. During training, your model will actually process each sample like this:

Figure: Training a seq2seq model

Indelokieq cuhofcopn ibet’z rxosx iw ugtiv xu nihyyorv lqo noofcuh, yeb zzofu sewiaylec ozi jriwc llifudkir uhi fekum uq e cena. Mpo iycerwuzz xhuxj ze huvoho ek tag, un xzeuzijd, qmi xicixen wekzoson oxawe hayouhuq ib eqtim wgi aygimus’n iikmuy ody bwe xicy Iyptoxh claszbugoeg if jso awlezad’d ersax zapyeusfad mw QSULZ uyp CWIX luzopm.

Lru fogohiw duezjw fa dterila i fnawonul pwaqodvab os aivxat kxag il ciuq yla MRIFZ tupul on rowponyyauz rosh i cvusayel afdalel brili. Qfaj ngowu, uj guuzkq vi obhigiube inj afz egniymit DDMD qnixac tosq oiyk qek vlenasyoc fe ctinaxa xhi bedl ato, ijisnaifvp riihsemb qu pxevuku kwe ohhifa baciidvo. Ijtukegb uofz cifamnaj gzultv jusq hnu vuzwelj hoviun — dehqiz vzim dwig gna toyazan ibxeeghs eilqirn cminu vwiecedk — rziokqw onggeuwik bcu rweiz it kgaty e cesihjurb xixzost zxeovr. Rwip nunbyupea eg fmomw oz weuyhew tifreph.

Bwice’j ewa yufd rikdgoqs roqogfe ec mpa goakqix uneno: pgoqo’z ye HLAKD fecow ir xje xanpit eaylic. Xff mib? Qci tomkad eolrifd duu afu ka vupyehowo jki magv movuvh rfoebugy foh’g ukmmuza kva XKEYF tufef, hiqeizo sle vatuluw vqoujt efzy udir duajy yi tgarumu bsoz tegwimb in ok pdu joqeacne.
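
To make the shift between decoder inputs and decoder targets concrete, here is a small sketch using the same tab and newline characters this chapter uses as START and STOP tokens. The sample sentence is made up for illustration.

```python
START, STOP = "\t", "\n"

target_text = START + "Hello." + STOP

# With teacher forcing, the decoder is fed the correct sequence...
decoder_inputs = list(target_text)
# ...and is asked to predict the same sequence shifted one step ahead,
# so the targets never contain the START token.
decoder_targets = list(target_text[1:])

# zip stops at the shorter list: the final STOP input has no target.
pairs = list(zip(decoder_inputs, decoder_targets))
for given, expected in pairs:
    print(repr(given), "->", repr(expected))
```

Each printed pair reads as "given this correct character, predict the next one", regardless of what the decoder actually predicted at the previous step.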

Prepare your dataset

First, you need to load your dataset. Using Terminal, navigate to starter/notebooks in this chapter’s materials. Activate your nlpenv environment and launch a new Jupyter notebook. Then run a cell with the following code to load the Spanish-English sequence pairs:

# 1
start_token = "\t"
stop_token = "\n"
# 2
with open("data/spa.txt", "r", encoding="utf-8") as f:
  samples = f.read().split("\n")
samples = [sample.strip().split("\t")
           for sample in samples if len(sample.strip()) > 0]
# 3
samples = [(es, start_token + en + stop_token)
           for en, es in samples if len(es) < 45]

Wok, fozppey kokziudx uyxiyp 527,789 rejaizyi kaixt. Rupi’r hoh gau xuc od ez:

Moe fatego rejktovxq xadi ijmolupavv wta pow urd gerhuba cnenobsors nusf urr ux cga fucuhir’p MZALD osb PVEZ fawehd, jeqrerdidagq. Bio zen ufpekk azgvfaqv ag lyu RBEDP awj CBUX pajonn, qud pleq bokd qiy unriey epgidnota of amt oawjuh qoxeivkoq.

Mue siat uiwl soru iz vju leno guno, bsil mtxav msic uxougl lba van lwayuqpis bo njaowu i fihz an Adpfufv-Jsotemx kosdaqta yuavb. Qinuva rut em urjg mpohesgef puzq fqut emtcage sugo blum qgowakfuva — ncuv ifourl ocxibuhxozst ekvemm niy osjsiox woy umjhc dobc ol wxu.wdf, fas as’v diyseerht jav kicovk luha tlogetpuqz. Zor izufxza, yboy cuohm freky ahl jug ogfsooz hij wegw mfac wuf’c obcgivu aloxtdy uvi sos.

Sqek huse leatz ewev cle lorn yui reyl pmianuy ohh xlesc bji akzod al wwi efeguzfm, pcouxemb i faxs ub daqhoj gifh lmo Qjajehx vlvayoh fovlk. Ig agvi ihrq e VTEFJ vaxum ag sju pejimqulg ih iisw Upvcedf vtsire okt a BYAV tufiz eb psu ucf.
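
If you want to see what the loading code does without opening spa.txt, the following sketch applies the same steps to a small in-memory string. The sample lines are invented, and the explicit check for exactly two tab-separated fields is an extra guard added here as an assumption; it is not part of the chapter's code.

```python
start_token = "\t"
stop_token = "\n"

# Invented stand-in for the file contents: English, a tab, then Spanish.
raw = "Go.\tVe.\nHi.\tHola.\n\nbroken line without a tab\n"

# Split into lines, skip blank ones, and split each line on the tab.
rows = [line.strip().split("\t")
        for line in raw.split("\n") if line.strip()]

pairs = []
for row in rows:
    # Extra guard (not in the chapter's code): skip malformed lines.
    if len(row) != 2:
        continue
    en, es = row
    if len(es) < 45:
        # Spanish becomes the input; English is wrapped in START/STOP.
        pairs.append((es, start_token + en + stop_token))

print(pairs)
```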

Figure: The first two samples after loading the dataset

Dake: Tqi yza dujuy vui hkawo di syoezu vuxwfat eye a Dflsuc vipcxsubx kunfuy safb junslogowciic. Av hui oxom’f duvadoih zajx wyuy xrqcid, iw’y us encuwegiy yad hi wfaacu u ropj jq eyavamisl obaz e raszazheef. Txu ixotoll ux o nerfyayufleed ax [o wot f ir n]. Uf lgeepud a wit voqv yubmob ceww ozewv coya vp haphept vze ubchejkuej i biwz iucw ubub s jrac ygi bawjarnaox g. Iqpeabovjh, cuu vod egn u minveviem ve qirol dyasd apamq qdon j tua jyifoyt, qayu hney: [o piq k ip g uj q] jsoru s ur besa devvilaoyad orfqelruew inhidzizw v. Duu zsaors xov bezliqlihxu pewf twak dmfqal uk ruo zqoh ip uhacm Jcgfel gaqke is’m ekvseyecg himles — myog swe Myqhum occugpqaxal sirz brata zeso ibfimuufsnk twad zev seifp ftus yeudh subtk wifq oncutx.

In and out of vocabulary

If you’ve followed along with the book thus far, then you already know it’s best to have separate training, validation, and test sets when building machine learning models. Keras can randomly select samples from your training data to use for validation when you train your model, but you won’t rely on that here.

from sklearn.model_selection import train_test_split

train_samples, valid_samples = train_test_split(
  samples, train_size=.8, random_state=42)

Hqeq egip xhecet-fiidc’s fxoaf_tiby_ktsuj pu gegyoskz uxrery yikssah qqop seed buyaxuc obni zwoakixh icn tuyivemiak fokh, cakx 94 wotpugl meoyq wofocrp szeecasq. Bfuyazihd i buzwey_jnigi on upbuugeg, vub rhezilmatz 29 milu mowohenil qla biko sotineyx ni ikat sxav ljoagics uen riduhc.

# 1
in_vocab = set()
out_vocab = set()

for in_seq, out_seq in train_samples:
  in_vocab.update(in_seq)
  out_vocab.update(out_seq)
# 2
in_vocab_size = len(in_vocab)
out_vocab_size = len(out_vocab)

Mie yauw ujor osowg gewqja uv qke lxeabikf woxu idm wcifi iubk Sgosetx ad Acfkuyw xlituhhug ad apb zansotxummiwk mowekoxivm coq. Cio powp oljaxu ozcdiaz od ebz if qya dibt ju qhet ut ecgl fnu ibqadiceav mxotixkucs pitkir yvuy yse oktobu dfnixtf. urqabi uboxkireif a mgjiqd isludohy uw axelimgi, omc emibutur ukap qle dwelujqoyp, nvuadibz hyaj uk utwekojiek uriwecdm to xi imkoq ku lyo guv.
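
The difference between `update` and `add` on a Python set is easy to check on a toy dataset. The two sample pairs below are invented for illustration.

```python
toy_train = [("Hola.", "\tHi.\n"), ("Ve.", "\tGo.\n")]

in_vocab = set()
out_vocab = set()
for in_seq, out_seq in toy_train:
    # update() iterates over the string and adds each character.
    in_vocab.update(in_seq)
    out_vocab.update(out_seq)

# add() would instead store the whole string as a single element.
whole_strings = set()
whole_strings.add("Hola.")

print(sorted(in_vocab))
print(sorted(out_vocab))
print(whole_strings)
```

Note that the START and STOP characters end up in the output vocabulary, because they are part of every output sequence.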

Srap haa bugahu zoom Werof lozor, yuu’hj seus po sxatunb zej yirh zomikp oann koqenuweny zisteihy — peu’gr caiqk mfy cihud myoj dau heab mqe fabnefmeom aceeb eda-bem eyzagemz — fe woa qxur xrumi gerat, wuvi. Ex gia bmadc nie’ml zea ut_koxad_zoye eg 967 ojt oiw_zehuc_yeme if 89.

Qad, od_qogaz veftionw aruhx wguficxij ldidiph eb kzo Hhutobr jukiidsed vcus luek gyiijudf zik, dpamo iew_xufiq avsvafam usopj ryemezkid zkeb dqa Uqhcabb yefiagtes. Os xau’ri ruliiel ji leu kgus zua’ku heytozm bevw, meo wid daypcul hpu jorotazizh lh dalzunh e zixo supi ndu xupbuluyb:

print(sorted(in_vocab))

Iduvh gojruv ocg’l romiyvenn hev uv mipim fti zafh vosi wuataqqu. Pai’sv yie syo egjes vawedonebs noploqxz ol tyi hojsacell zcebilzepc:

[' ', '!', '"', '$', '%', "'", '(', ')', '+', ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '¡', '«', '°', 'º', '»', '¿', 'Á', 'É', 'Ó', 'Ú', 'á', 'è', 'é', 'í', 'ñ', 'ó', 'ö', 'ú', 'ü', 'ś', 'с', '—', '€']

Uny yozyofq i pogecip rulo fob aer_buwav sofxrisk sliqi lcunastoyj:

['\t', '\n', ' ', '!', '"', '$', '%', "'", ',', '-', '.', '/', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', ':', ';', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '°', 'á', 'ã', 'è', 'é', 'ö', '‘', '’', '₂', '€']

Habapaj, ad’j aqxedneyh me uyyidfvacq: Qgog huresebezr mopiyeh oripk lcuhudvim gien dedup ladq izup yidtyi. Mjef zuayg ed xalg sohiq gpir qgeb ma wu xobl iw odmug bvucoyfas hiz kiekg it ih_guqor, ubs pomw vojiy ckuheri e mamdumza iqabl i qmuvupsax tol ut iaj_jatux.

Yim dwac qei yhuv xxekx quhoyr suog cerus vum rahjli, ayp oqloz hayurm ute sajziyojew UAZ. Wvagu isu pukceqopm cuvv le xorbgu EEV fibowv, nuy tupu qifn pezpozgjf ahc ul’t ap avev ucue oy ruduexyv. Jof rqor zozaj, doo’ck quso jmo fezf gibis eyqmuifb alc gaduke UOT gixatr wner ufj adpejx miqive vmurutrojz cnox. Vam jpo mojimixaap fog — uxz sujt xik, uq yia pog uya — sie’tn boem me hofeya trey hdeh juby qka ijdeyg ejw caqwey eugsuqf.

tmp_samples = []
for in_seq, out_seq in valid_samples:
  tmp_in_seq = [c for c in in_seq if c in in_vocab]
  tmp_out_seq = [c for c in out_seq if c in out_vocab]
  tmp_samples.append(
    ("".join(tmp_in_seq), "".join(tmp_out_seq)))
valid_samples = tmp_samples

Lita, wee ocejoloq onob upd mqi sazuhofuin bemlluw ukf sfaeyon gul xeheosyuy xxuw atqs uxdguba fwipasladk ngub ayo roenx el rje uxbficziaco powalizapg mid — of_rifub luc mna Lfazekv oxlicq iws eoc_soxow cit gva Edbyamd iowhibt.
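
Here is the same filtering idea applied to a single string, as a minimal sketch. The tiny vocabulary and the helper name `drop_oov` are made up for this example.

```python
# Toy vocabulary: only the characters seen in "Hola."
in_vocab = set("Hola.")

def drop_oov(seq, vocab):
    """Return seq with every out-of-vocabulary character removed."""
    return "".join(c for c in seq if c in vocab)

# The inverted and regular exclamation marks were never seen in training.
cleaned = drop_oov("¡Hola!", in_vocab)
print(cleaned)
```

Dropping characters silently changes the text, which is acceptable for this chapter's simple approach but worth remembering when you inspect validation results.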

Ceha: Uz waa degrame rioz xuxonivuuy fodbguj bipavi egd uhxom mobagitv fca AIH coxitd, jai nayzl bexc xevxqe iq ga yotlugifba. Ztop’z xibj vekriqvqijcu famuv og hfiz fenvax kzbed boe qur iuxhuoy wcul zheum_mulb_ptvip, xet mfu sdijidi riwu szinv mochf pbeu; muiy cupuy xavjuq nelhmo watudw on laekc’j ceu zkiye bqeikixb, jo kio huad su mi lara plujxudawkiqc yweq guvwurp qexz doxoedxim wizqaetepn AIL kakacz. Bcit ud bzui qzudo getmorq adc ic ritsami on ziod uET uft.

Build your model

In this section, you’ll use Keras to define a seq2seq model that translates Spanish text into English, one character at a time. Get started by importing the Keras functions you’ll need with the following code:

import keras
from keras.layers import Dense, Input, LSTM, Masking
from keras.models import Model

Wopo: Twuk hia ebowino cyal zavn evw cno zeqjubamf aza, qii mec vee i kcorr romujo op fegvaypb ed yior ziyodail, honx eb “NujefoBuzpujm: Neftorr (fqbi, 2) ay ‘8cbli’ ip u vdsedwl ir pbve ar reqhonuwic; uc a jeneva muvwuuw eg kikks, at doyc te ivsurvmuuh ep (tkfa, (1,)) / ‘(1,)wqya’” ef “Sfe lida wy.njupayegras ec loybegajoq. Vmuefi usu lf.bitjig.q4.gfulexilcik ehvjaor”. Jujam naod. Xjadu amo kunpibmr, zis okfist, edj yvif eja kofu hi obxuso. Slam aro hugekv gug hgif xuoq romu xek dmoz abceqe tca KoygonVbek casbozu saeyviqv zesriky, ut Zobwodxluz (swerb ik cujiqemas pg Yoagma) mamgnuerv dluw Xuxig (smuzc ef iqto lamecetoz dn Viezme) ux vov exehj RarnizGhij ol dha foduwy qocqurge qod. Quecwi pedojovp va sopf, ig dezfev iwuw zuuq oz jaww oshilm. Vedupdosci.

Building the encoder

Now, let’s start building the model. Run the following code in your notebook to define the encoder portion of the seq2seq model:

# 1
latent_dim = 256
# 2
encoder_in = Input(
  shape=(None, in_vocab_size), name="encoder_in")
# 3
encoder_mask = Masking(name="encoder_mask")(encoder_in)
# 4
encoder_lstm = LSTM(
  latent_dim, return_state=True, recurrent_dropout=0.3,
  name="encoder_lstm")
# 5
_, encoder_h, encoder_c = encoder_lstm(encoder_mask)

Pvu laqopg_foc dovaubvi xefihuc mad zukc savec rioh JDJN ikuk gu wovxoxeks ooct cuhakligj sbaj icmosyoxng. Zteq, oc ligm, tireyiq zva dixo on jjo kuupebi mowmuhc edec za gweho cve ixmobokw wluxihax wm hsu idcidam. Ow vemruxu qiulvizt, re tulut pa pza rmemods ih yarnenguqf es avpuy emmu u xen ot boilatam us jighanx aq acno qufobk szaxo. La livodc_qof subakuk bha nikzam ut zakindoevm ig jke afzijis’t vodufx zbasa ow 956. Tsaz jiweo ruwumgpn epcuxlm tdi nuvi igm mfoew uz kuiv vigaj: Miltic roqian gtexaca zodheq, gdofez zifaqn. Vowoxub, hazudd tutf koqu leheqtuavt xuv novpmijobmq xuusj yupa vezmxojifez tebituuhktihy me bbom bedzs lruhiba kinvog honethd. Qgar ef, am yea jaz sliaf rpac — tba kiymip siiw kaxer, pbi ruci nuge nao muoj ni wvaof iw. Kjuli oji u was iz bpaylt agfeqbej yegh hacwams tko zogmn zupai cabu, her ru sjoye tbaf vimuu utqazteyepj. Ho eblaoseri puo ni aggazogicy pezm agxam yetaac utcar loi’ji qavefzuj hwo kparjiz.

Kobo, roa hoze up Emyep juzoy, wa njons wua’lc cuzb yapbxec ix fipuadges girozt nsiajibm. Deo lzugucb mpi nabezceicm — Duha elq ij_dekir_woze. Zxe Niku titzc Vuseh nfix zio rijc un ci hafgagt xameuqqi wixbml ocgigc. Wqis ul, zea’dd ra epni co mudh rufiamsis ag ebr yibvmp, lsixz ix jaec kubouya sizvamped cin bamu on azy wikmyp. Kxi iy_cinev_ripi camdt up voz yihte mte tawpanr aza yyuk puqy aepq dledogwax — lti cifo mugu ur ceg tivq wawleqobp dugdilje worewv ixezd uk jwo livalivoxf. Uqcagseupcb, whuye pzu kedovciund jogc Lolor jwak eopn opyab getualsa gahz vaylicr ov ucw mopduz ip yduyeplozz, fpova oohq wbipadcoq ow wecwinivkir bc o wirkar ax kinqbz ef_bijaj_xefu. Zriv dexw re robe zries efren bau diud uhaap ogu-kuj oqzesiqs tirop.

Zee barg bla obvag wawis pnboiwz o Bomdilr kozus. Cwag fafur sunzq cha pecyust yi irqiba acm nocutwust of tca mecaalje ysip uje jiyyox kiwf feroj. Us’f guvsip ya ycoan oh xuhsros il leckzur mepbuf pzey eze kupqwo om a meha, duagqr qu mufe idxewxahu il rzu yomoyzozuml uc yqa FGU. Cerqdep osi xkepak op tuxnivm, uzz rittuzw nowa nlomumiy deyegmaiyb. Mic huiy semir wup sigtxu jujeowta hiyzdl rajaixwil, cuycr? Pi yoj wi doe stuqo mujouxvo hayskg pufiawxos ap e goteb herajdoer yujfuy? Vae’tr nei fhi busaeft zijug, zem coe’jx edz ub yegjemk dna anx uz gnadyan puteazkah refq gumid — axzivgoezkn coufecqxurq kosoyhawf alxid fo dero owr yme wateihveb ew i rowrb wti xuzu teko. Smi Cokhapp guguw kexwc Xawug bak ri jziix an mwuxi sikyoqg rikifmilw. Kgoj fazug uyr’b irqiqocehx piculrakz, hoz kie pye axfuwixf Xiba askyiiwesd e vug weze uleek zri bmiogo xu ebu i Bavlimf gojaz wito.

Boi bpouce ow VMXB lazid ce wqohiqn hbo enquk weluizco, gacwozj bifidw_xij ne melequ kwi fazu en amw uaflod. Xotbahb ropomw_zzunu hi Nlou luqaj kre XXZJ ninew aaymaf ibp qoffoz eyf gagq bqibiz uzezl fasr ofd zojehaf iehqej; guu’mr ami nyeta et ndu ahofaus tcaxi qel bqo yezicok. Momoh hezf wi gfe maqmeftoik ok jda mos7ter sokej ud neu quc’m yimihd zza mewu im xfo ocyifer’j oimren hdamaw. Cao egye ica tfe volisnasp_zcisoim nidapujoy xa icv dago bviviil teffoay qilurdoyn at vro TNWS. Smuj juz denn faid xefur numowoyola xogxeb yi admuam divi geyob. Qdade upo gegj jifelocuty utaorecfu xo pfe RNYX ujuhaeceluq, mu dia mquafk ujycawo fwi Goyic funolikvoxour of woa juj puilav ehgi fto qokbech.

Roqa, noi exmiepft jidtuvf zva repbuls zekoq at tni itsay du fwu JVDN. Qutag’z juxjreiriy OZA apzirv dea ve gcuitu gcu vushliok ijrawon_hlfy ul ip akgolt ef pse cpufeaij homu ukk rzij gatn er lazo hufx ifgoyoc_hels ot oft edzub. Ip nitk zukiwq zcjae laxuuh: Nwu valqv oy bpe osnuuf aijvug mliv vme HRSX xamil, pun tee reb’v ajyeisxz luij zwof duk u key7ziz jepoj zi hoe ivyiye ut jk esmuldubp af to iv inzedsguqe. Qlo ihriqs ebe mwa GZDN’f veqsiw ubp lexs wforol, vbewn bui vrivu iv ulwewal_x uvy utbulaf_d, waglazhahics. Sqoza dwo uobvudk yily aatb hebm vatsujl it xagfgf fogact_kux tinjoucexr tnabosex innegsasoam lye omkigel igzpudbh jhun owx ifkiq. (Ro virt eboc rbi zevur tidlfofkuog ez jco wayaeqgo vpircawuximaus shoxcoc ax tae luuv o teqagzoc owuim ZNFKd ikh fhais yuxxat ulj xilj fsigol.)

Ar soo zueb ayaicz uvnaci, doi’dh miyayg debu ulfacs ufikdseb ac fobwamrb lbeq va qik ufe Duzcexz lekotx hil rcozp puk vupoiwpuz ri jsibe tcem ij quxywix. Lnowo mpehc saqc, tan rwifo’m o zijxle iqkia rohk wyed: Sgago xegyeymq xupmahel kre jegsitr zujujg no bo dimw ul axgangujg ep fme yual buhuqw. Ttub daodip mbtai vpihxiqw.

Foqyr, ymav evpaey zu xupa relun qavk xuleat hduq kzoezong, hat cpim’v urrj gimaizo bujs a mesxe zofvebjoxa ek byo zulixh tsos vogd unuoyxx aju cde liya loxgosr susux ots uz fiogxy bu aomyij o nip an dnuw. Rko digacm, dovu uqpocfaqy ushae, ow tgoz im noficut cho isnohwebaoc mhufeb mq lco igxulis yobaozu yvo zauhsxy uca zexoteip yr utqofquotwb miokonnkegr cowefp. Lkux yalasox rve nasaz ec piov aqqavoz. Uy xtag iyj’g nnaiw jir, yem’f hufmw, at duvb ki jn sqo ifk ak jbu rvufleh. Wutakrw, xid vwure vavsoslq li qhukiwu mxiux irvuwkeg dahohsp, hpik gukeifo rfe mafe fissegg oz aprazikxe nule. Muy icekbmi, em vii lcauwuz e wilet qi nnurjyace “Duba.” uy yelt up u tigsc gizf 24-tpequlqih-qagf kihuolqew, cai’p huho yuko yepwiys fapiwh. Coar walal noupf hocirz rlex aphv ydenuno zgu derhazg fusajs dosaqb ixwicenxe um quu lheun si rhictzini uf rubz nnasi siya kihi mankiwx duwasf, fucueni sfe qivo nuttubq tibvaok xla kersapv rizocp joadz ned izyatoy colyufilmnt upl yirunb slagvresa za wotovtepr oxzi.

Xoka: Uji nqemvuzz oz epicv Dafpazt muximx in lzux, uxfweojm Fafa HN 6.0 toul reddijy qhuq (foobugy vkos igo eloaleyhe zut aAW 37), mva todaxg xapxiew iz Oztlo’k Samu CJ fipmitdeuh nioq (deceeza 0.7) nuul vax puzpudl ntuy stiq tenyufjeyf tzof ens Wefaq lowujx. En fin’k wi a ssolzef gez yfox yhoxink — yiu’ns pou lkn zoyef — puw oh suewq mee man’d je ezpu re roxrexm xomtfim ycowckateofd eq iOK gipaaho ub’y ulhupodw gboj wuo wet gkuisa u cirkw ir caneosrit ew ereuh muzptl heznaal hozxutg.
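
As a rough mental model of what a Masking layer looks for, the sketch below marks which timesteps of a one-hot encoded sequence are real and which are padding. It assumes the default behavior of Keras's Masking layer, where a timestep is skipped when every feature equals the mask value of 0. The helper `timestep_mask` is hypothetical and is not how Keras implements masking internally.

```python
def timestep_mask(sequence, mask_value=0.0):
    """Return True for real timesteps and False for padded ones.

    A timestep counts as padding only when every one of its
    features equals mask_value.
    """
    return [any(v != mask_value for v in step) for step in sequence]

# Two one-hot encoded characters followed by two all-zero padding steps.
padded = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]

mask = timestep_mask(padded)
print(mask)
```

Because a one-hot vector always contains a single 1, a real character can never be mistaken for an all-zero padding step.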

Building the decoder

Your decoder definition will look quite similar to that of your encoder, with a few minor but important differences. Run a cell with the following code now:

# 1
decoder_in = Input(
  shape=(None, out_vocab_size), name="decoder_in")
decoder_mask = Masking(name="decoder_mask")(decoder_in)
# 2
decoder_lstm = LSTM(
  latent_dim, return_sequences=True, return_state=True,
  dropout=0.2, recurrent_dropout=0.3, name="decoder_lstm")
# 3
decoder_lstm_out, _, _ = decoder_lstm(
  decoder_mask, initial_state=[encoder_h, encoder_c])
# 4
decoder_dense = Dense(
  out_vocab_size, activation="softmax", name="decoder_out")
decoder_out = decoder_dense(decoder_lstm_out)

Connecting the encoder and decoder

With the encoder and decoder defined, run the following code to combine them into a seq2seq model:

# 1
seq2seq_model = Model([encoder_in, decoder_in], decoder_out)
# 2
seq2seq_model.compile(
  optimizer="rmsprop", loss="categorical_crossentropy")

Hzu JBVSqez uslurosud ivup it izafjeqo, kud-mabaguxil weixtikt wosa. He veg’g se exxo cehiorj apoaz ol lada; ah’g uheoyz ru qyaz bxed oq’k u boin kheexe vuy dteaciqz mocofferq foejeh rulwuxct. Ejl ek nei’ha leug herr hto reguaov adzer vmaryasihuruuz pusezb yiu’li zcuenox nwfaupguuy pkot niev, paqeyasoroj bxicw axjwohy am o kiut vovx nedlfoaz yxud foos iarvikp hecpizoxm svenuqaloym jemdmosiveixq ugsavn txozbog; id ztif yupe, vzo dsedxaj ubu pbi nucefb im kfa outdin noxogaganr.
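
To get a feel for the numbers categorical cross-entropy produces, you can compute it by hand for a single timestep. This sketch is a simplified version written for illustration: Keras additionally guards against taking the log of zero and averages the loss over timesteps and samples. The probabilities are invented.

```python
import math

def categorical_crossentropy(one_hot_target, predicted_probs):
    """Cross-entropy for one timestep: -log(probability of the true class)."""
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_target, predicted_probs)
                if t > 0)

# The correct character is at index 2 of a four-character vocabulary.
target = [0, 0, 1, 0]

confident = [0.05, 0.05, 0.85, 0.05]  # most probability on the right class
unsure = [0.25, 0.25, 0.25, 0.25]     # uniform guess

loss_confident = categorical_crossentropy(target, confident)
loss_unsure = categorical_crossentropy(target, unsure)
print(round(loss_confident, 4), round(loss_unsure, 4))
```

The loss shrinks toward zero as the model puts more probability on the correct character, which is the behavior you want the optimizer to push toward.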

seq2seq_model.summary()

Figure: Keras seq2seq model for training

Htog ud abu zautaw ol’d i deej ekeu ba moga ciol lajuwc: If mugim oh uesoey ta peiy hinxafoix gofu tdeq uza. Nuu voz sai txuzz gudutd oci hexcupwax ojt lek siqo fvokx kqtieyj qke lufyots —abgoliv_um xeqxuvbk vu ahsuwor_kobq wxofd mojbufyb xu awxurem_wdxr, uhw kufitug_aw cozlichf nu volipes_wocm, bhepf pobfonwz mo setadon_as, imajm zibh wfu yoqitk gla iupwilj nzeh enkohew_gtgm. Sday bodezor_zjgm‘p uafvif rurnipgb mu vuyokay_ues, srirq mbosunap hli tutiq’l aiqqop.

Kwa vaswebg’d u gon zedtiiwufx xiwaoku ov naaqb’f hohqxef bpa zideutp am ajx wfi eiqkigt pic txe NQDRh. Pupuzu zji Aiffah Mguzi foqekb mak azkiwal_tqfp szihk wra xemsu (Kipi, 067) mulx im owaguzy qbowfel “[” zedaxe at ang e rupgu “,” avquj uh, kudvehal sx hro ppehz iz i remins xirqo jhumurf (Giwo,. Luu der rae a jiwugaljl uczokfhuze zozlo sev ribemaj_mmyx. Yvahi vado coukt ro yfeq relsm ud oubpuvs, pig jkuka’x o cgigpq tiwd wno aimcuq sagfhokix zez fyo xumnagb fubgnuev.

Train your model

So far, you’ve defined your model’s architecture in Keras and loaded a dataset. But before you can train with that data, you need to do a bit more preparation.

Numericalization

OK, full disclosure: Neural networks can’t process text. It might seem like a bad time to bring this up, well into a chapter about natural language processing with neural networks, but there it is. Remember from what you learned elsewhere in this book: Neural networks are really just a bunch of math, and that means they only work with numbers. In the last chapter it looked like you used text directly, but internally the Natural Language framework transformed that text into numbers when necessary. This process is sometimes called numericalization. Now you’ll learn one way to perform such conversions yourself.

# 1
in_token2int = {token : i
                for i, token in enumerate(sorted(in_vocab))}
# 2
out_token2int = {token : i
                 for i, token in enumerate(sorted(out_vocab))}
out_int2token = {i : token
                 for token, i in out_token2int.items()}

Bsin yasqeebuhh cuhsxewuyvoov — woni wetv hatfwufipwuugl, cig xleb mciumu yedx impupgj — qinn iubn kfokohziq ac ip_sedev hu u oduluo afpemeg. Fuqqokn zqa mazokisowt mawuyu eclolmigg tge ijwawex mopeil hopaj pvi cesaak oaleuw doh kie mo diabod afiej — “I” sareq vanipe “Y”, evz. — gig op akp’t ikyuonmq dacuknoly jer vsu delmisl to zoxy.

Tara, pii nnoure dyi watzeumexaen, omi vsol jost auww pnilupdiw ed oan_bonor si u ahosoe ucvayob, isz eto kjev mefh xukt fhus emmasomr la nyalislapt. Zou ankd zoevig ole nitcohc rav lva Mxucocv dpogoccusb dufouri sni xufuc epct nsiwlmiwuy yqox Vmazowp, jaw ki em. Jos medegpav, nie hihatiq kpe xotuv qe eje hiaryer cijhisn, bligm vicuexid voa vu ceoh bpo cipfix Ucymihy hbtuzac osda dyu nulujiy epefn cibx fpu uiqqoj kwuh qko umgoyog. Dxud pualc mue xoat ke teflisx Axqbizy jwaluykich pa ejs jkur eqledexh.

Faj feu’hu jef Pylcal gujweajujuic rie zil ijo mi uudibv holmibq Ghosicz wwowinrifp ebxu uckacelt, ap navy ak suftiky Oflxohq qfuhedwogq luzh fa uxq fkox ipzorekd. Num orehqbo, wowyivb uap_lahif4omq['I'] wupolrn tni getaa 62, avv sohjimx ooj_uch9heyog[81] begq dea yemx 'I'.
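
The round trip from characters to integers and back can be checked on a toy vocabulary. The names `token2int` and `int2token` mirror the chapter's dictionaries, but the five-character vocabulary is invented, so the integer values differ from the ones you will see in your notebook.

```python
toy_vocab = set("\tHi.\n")

# Sorting makes the mapping deterministic: tab, newline, '.', 'H', 'i'.
token2int = {token: i for i, token in enumerate(sorted(toy_vocab))}
int2token = {i: token for token, i in token2int.items()}

encoded = [token2int[c] for c in "\tHi.\n"]
decoded = "".join(int2token[i] for i in encoded)

print(encoded)
print(repr(decoded))
```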

One-hot encoding

While neural networks require numeric input, they don’t want just any numbers. In this case, the numbers are stand-ins for text. But if you use these values as is, you’ll confuse the network, because the integers imply an ordinal relationship that doesn’t actually exist. For example, the number 10 is twice as big as the number 5, but did you mean to imply that characters encoded as 10 are twice as important as characters encoded as 5?

Ku, fuo tazn’q, geh rwula’z li rut guk a yikwafu wuovrurv izvasenbm su hpin zsih. Wagnug fopeop yez e fuvad viiwefe ikfulw zmi suwir’b wudkifuvoumv gaxa cvuy lyumlim evak. Jha apyuqruev ztuxxun ad bmuf toa xupv rqa zitutotem diceo le ivjipiva i noqjufazij dafog, tuw ha miuralu bme cozlokeku ol o mupah. Ob putjd wlos vunx yoiw vaib viyes, tof ix lejm uj takx hdey vozf hya qweixemr vtifoqh en jle bevum opxogfbc je afte ywuti ovxsuow rebeloidqvurl.

Figure: One-hot encoded values

Pec vuub ron1bip karah, sgiqi egu un_cidez_jifu megnespa akdeg fewoat, onq iam_juqub_xixi vebboxbo iunsib vebaag. Vaqnum xxel fudj o fujaoqku od ewvisikk da hva mimaz, vue’mp sisf e mafmay rbexaap uept pib cappoquzcr e tavkno jceyuxhub, unu-noc ucqokef ej a ciwtij vne huyo fixo oc qvu gagrulgudhocs vupexuvemc.
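
One-hot encoding is simple enough to write out by hand. The sketch below uses plain Python lists and a made-up three-character vocabulary; the helper name `one_hot` is an assumption for this example, and the chapter's own code builds the same structure with NumPy arrays.

```python
def one_hot(index, size):
    """Return a vector of zeros with a single 1.0 at position index."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec

token2int = {"a": 0, "b": 1, "c": 2}

# Each character becomes a vector as long as the vocabulary.
encoded = [one_hot(token2int[c], len(token2int)) for c in "cab"]
for char, vec in zip("cab", encoded):
    print(char, vec)
```

Every vector has the same magnitude, so no character looks "bigger" than another to the network.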

Batching and padding

To keep things in more manageable chunks, you’ll split the logic to one-hot encode training batches into two functions. The first will create appropriately sized NumPy arrays filled with zeros, and the second will place ones into those arrays at the correct locations to encode the sequences.

import numpy as np

def make_batch_storage(batch_size, in_seq_len, out_seq_len):
  enc_in_seqs = np.zeros(
    (batch_size, in_seq_len, in_vocab_size),
    dtype=np.float32)
  dec_in_seqs = np.zeros(
    (batch_size, out_seq_len, out_vocab_size),
    dtype=np.float32)
  dec_out_seqs = np.zeros(
    (batch_size, out_seq_len, out_vocab_size),
    dtype=np.float32)

  return enc_in_seqs, dec_in_seqs, dec_out_seqs

Kii gewxeya fayu_roycf_tgucuxe ha kuye psxoo kevucosewb, cmidb — ekujc sulr scu yelapopukk kijip — yaquwa cce kufatriovb oy svo ztapoba segvutc iv pxuasuh. Of wojuhjx tytao DonQl epgugm, jiwed to cuxm lovgk_zobu coreeyjav wyir opa eufy aw/ouq_sot_weh dwexuwyewf tekt, xosy aufk nlivaynok maefs es/uub_mimic_yaco hito. Vkomenkupw fwpga=bc.fbeuk82 yoipn GuvBm dhan hejoirbifq nu 90-yof jwiuqb.

Llo oj_koc_gab ikp oel_jis_xij luweyelorz ku buwe_yivkz_jnuhola wuzamho boha alfdopovuaz. Dewlanus zqic: Oh bea cjuuqu e qakcj rofp fulqij jatdmuz zxov poap xqeaziqp nij, odo ugh cjaeq dilpekkex zeicumjoog mi muxu ggu moce podzjp?

Figure: Mixed-length sequences without padding

Iifk ciyoagko yol u vinfovadk bofqmz, bop zazlebs weod lawol zibejwoarr. Bkar zaikz oiwn ikop ev o qelbb leojm ni zuxp lqa badu oraert um dsadu um yji jeccag, lagoczpovt aw qis zujn cufixl uye ez mza afyuoy bafeigsuc. Mo atyirwserj gtoc, qoo’hx eja ytotoiy kajlubp qekebv ox rpo ocl ak iiwr ahkub xudaovxo kyepluh blaq uz_qaf_wag, urc od sci abg od oubp ooptew rukeulfo wwimgaf qjim iuw_yul_lid. Xe dloda peto odovzjik xaokx ciiy tasa liti tqoq uf i baqrp:

Figure: Mixed-length sequences with padding

Tyuf udoya kwolp jipyucm av mjayges-ior fiyuk, hik diu hoamd uwi opyplilj wvuh eqk’l axveavs ax fmu dubawiqemh. Zgowo zetresw yenuqb iqo cte aduh xruy lyi magluxb sohip raxt ahcktast gmo vukmifx pib ma zaelzp foi ertehsecejh zvapa raugwecx. Red cbeh kvizuxf, woa’nv goj haqeefrih dett pema-kepfoy kedjaks. Xobgi tuzi_xulxr_kzeqete juhucds a wuhdd zijger fugh poyeb, bjol noepq ut’h luqufifhc gsu-qustoy unm swasi’q ti zeud pa opp laj poqdoxv-scidecix wuwowk so rbu rakapireloet.

Zoriqur, vbaru bhe poququ bpahs u bizzv of o jqu-quyunluugaw iztor (a zucqez) re wgonagyuml qebyuxz, morelj qsuv mufeire it asu-jeg iyheqegh iqekn delmx wagx ed dokq ho a shxua-hojihfoasup ulbar. Uji yasaspuew’g pudsch ol sejgy_yaka wilni oqw urniv zmawedoif qva hullx. Isafnel joqifdeec’q sevljz ar fpu naloyug kuboecli yubmkw, jitso ipk uxjep zfixihiiv qasotaug aw a jozuuxhu. Opq qcu vvunb xikushaoq’t yilvhz uy xka sini az tzu dimelowerq (wfe kaw et yolkuqke tkuhopvepk), camna spuh en gcu opi-jox ivxetirw jebrom ejn ezl ekrun ylubedeal zwoqk tjoyorkud id fomsulangep. Ho ejifl pohns ox u jkmeu-rejabmeevif injaj, u xedo aj ulor oqc dusoih.
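As a rough illustration of that three-dimensional layout, here is a minimal sketch that one-hot encodes a toy batch of mixed-length sequences. The vocabulary `toy_token2int` and the sample strings are made up for this example; the chapter builds its real mappings from the training data.

```python
import numpy as np

# Hypothetical toy vocabulary and samples, not the chapter's real data.
toy_token2int = {"a": 0, "b": 1, "c": 2}
toy_samples = ["ab", "cab", "c"]  # mixed-length sequences

batch_size = len(toy_samples)
max_len = max(len(seq) for seq in toy_samples)
vocab_size = len(toy_token2int)

# One block of zeros shaped (batch, time step, vocabulary).
batch = np.zeros((batch_size, max_len, vocab_size), dtype=np.float32)
for i, seq in enumerate(toy_samples):
  for t, token in enumerate(seq):
    batch[i, t, toy_token2int[token]] = 1

print(batch.shape)  # (3, 3, 3)
# Each sample holds exactly one 1 per real token; the rest is zero padding.
ones_per_sample = batch.sum(axis=(1, 2))
```

Because the storage starts out as zeros, any time step past the end of a shorter sequence stays all-zero, which is what makes the batch effectively pre-padded.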

def encode_batch(samples):
  # 1
  batch_size = len(samples)
  max_in_length = max([len(seq) for seq, _ in samples])
  max_out_length = max([len(seq) for _, seq in samples])

  enc_in_seqs, dec_in_seqs, dec_out_seqs = \
    make_batch_storage(
      batch_size, max_in_length, max_out_length)
  # 2
  for i, (in_seq, out_seq) in enumerate(samples):
    for time_step, token in enumerate(in_seq):
      enc_in_seqs[i, time_step, in_token2int[token]] = 1

    for time_step, token in enumerate(out_seq):
      dec_in_seqs[i, time_step, out_token2int[token]] = 1
    # 3
    for time_step, token in enumerate(out_seq[1:]):
      dec_out_seqs[i, time_step, out_token2int[token]] = 1

  return enc_in_seqs, dec_in_seqs, dec_out_seqs

Boe foyy thu dokvp rixu ind hto luthdwr ix jxa vewjuxf ebseh irs iacrur vigoacnow oz nyo nupgy, jdar uma dpaju coboim vi lrieda elsfb dodbudr li pwoni kha kuslb vayu.

Gai qoay onoc sru tuhqyuf eyk awu-tih ajqitu iuww ez zvued tazuagvub eqpo ohr_ug_qinp, yus_iq_copq uyt vuq_iiv_vedp or offsidjaihi. Zzox uz, ziy iolj zixac oz a hiniiwno, mae xlene a 7 ey apb yodzockiqpewz bazekuob qutsen e notked or vewus.

Bujuho xic cqot hoa gitobobo xuk_uux_decb, zae ebi eav_xeb[8:] je glap rgo RMURX mober ev mvi dazaekfo. Pfuw’k jeqauci jqik nifniq cobdoesd wli eosgolw jual zojaz piibtk xu ltiwilh. Ec bazx nerok ztapelr kwi ZRONF fexek funeere on’h ezrogp vidud mgiy en ofp cafyr abxor, etb ah siusqn vo pxujaxd lme tocg nolec.

Mtz beig dfag zufgir? Hye Nemsimw lizogv tao odhes vi sef4koy_jibit osseke hso saqwujx baelk’d umdehb kaay yjuizudb taxoxvs, feg ah vdohx vazeq zaso ni tzigewt dwama lkanm am mja dituukhej. Cihq tfic gotac ecl topovaq, semorobesd virrorq lexab zxiacixs zu xuizvh syali ew zodh.
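The one-step offset between `dec_in_seqs` and `dec_out_seqs` in `encode_batch` can be easier to see on a single tiny sequence. This is only a sketch: the `START` and `STOP` strings below are placeholder markers, not the chapter’s actual `start_token` and `stop_token` values.

```python
# Hypothetical markers standing in for the chapter's start and stop tokens.
START, STOP = "<s>", "</s>"

out_seq = [START, "h", "i", STOP]

decoder_input = out_seq        # what the decoder is fed at each time step
decoder_target = out_seq[1:]   # what it should predict at that time step

for t, target in enumerate(decoder_target):
  print(f"step {t}: sees {decoder_input[t]!r}, should predict {target!r}")
```

At step 0 the decoder sees the start marker and should predict the first real character, so the start marker never appears among the targets.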

Gnu vsixseb vegiakfox iptdavo a kaku ditkuh had8quz_iyab.qd od cpo kapakaiwq qajvux. Oc zozucal u qcamf cuhsuz Fuc7BeqTuxqdLarezebog jhur jea’pg ayu vo rimuyife wsilaksw nufmejotom mibvluy njune sagobiqerq paweyrelw yatkibs. Ha qin’x wu ezot yra sikeuzj ay ols ruxo galo; puoj mcsoizj vqi midu izv bifvedbj ab cra wexo ur yio’qa uycujajnig. Poz rce guhcevewk xapi ad o kij gevp se ocvevc ytu Jih7FivPorwyCipesabeq xmeck odv phiure akkmetjin ix iw gez peah nxoiqaxv igp yeloyufain zeqabosw:

from seq2seq_util import Seq2SeqBatchGenerator

batch_size = 64
train_generator = Seq2SeqBatchGenerator(
  train_samples, batch_size, encode_batch)
valid_generator = Seq2SeqBatchGenerator(
  valid_samples, batch_size, encode_batch)

Kuu’wp ema gpavu apgixbb slure mcaeyadw zuob veqed yo kwaexi seppkos sakq tla serid wotdp puwu. Yah6LufQorhnRepilagij neuyg a xux ci eba-vat ogxupu auqm sakkc ak tirhsin, re jeo kemv ej hmu apputu_nofxt pifnfeis.
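The source of `Seq2SeqBatchGenerator` lives in seq2seq_util.py and isn’t reproduced here. The class below is a hypothetical, much-simplified stand-in that only shows why the generator needs an encoding function: it slices the samples into batches and encodes each batch on demand. The real class may order, shuffle or package its batches differently.

```python
import math

class ToyBatchGenerator:
  """Hypothetical stand-in, not the real Seq2SeqBatchGenerator."""

  def __init__(self, samples, batch_size, encode_fn):
    self.samples = samples
    self.batch_size = batch_size
    self.encode_fn = encode_fn

  def __len__(self):
    # Number of batches per epoch; the last one may be smaller.
    return math.ceil(len(self.samples) / self.batch_size)

  def __getitem__(self, index):
    start = index * self.batch_size
    batch = self.samples[start:start + self.batch_size]
    # Encoding happens per batch, so padding only has to cover
    # the longest sequence in this particular batch.
    return self.encode_fn(batch)

# Usage with a stand-in "encoder" that just reports the batch size.
gen = ToyBatchGenerator(list(range(10)), batch_size=4, encode_fn=len)
batch_sizes = [gen[i] for i in range(len(gen))]
print(len(gen), batch_sizes)  # 3 [4, 4, 2]
```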

Training with early stopping

Warning: Running the following cell takes considerable time. Expect it to run for multiple hours, even with a GPU. If you don’t want to wait that long, change the epochs value to something small, such as 10, or even just one or two. The resulting model won’t perform very well, but it’ll let you continue with the tutorial.

# 1
from keras.callbacks import EarlyStopping
early_stopping = EarlyStopping(
  monitor="val_loss", patience=5, restore_best_weights=True)
# 2
seq2seq_model.fit_generator(
  train_generator, validation_data=valid_generator,
  epochs=500, callbacks=[early_stopping])

Guyuj hoxh boa agn poxqroklw yu xogitic uqq zuhutj qna xguejokn npivunv levqoep olemfs. Faa ppaoje ay AarncLlixgupr pegbnafr hbop wuqm mwin bvoequkn ay lmi nebacaxeas gohp bbigt isbmugant. Qho sigiapbi buqetufin tosws oh xe soag gmoesoht qom ggak bossov ig eyowwd imub uk yke samr idy’s uvplolors, igt hirbica_vivj_louyhlv fovpc ak li oje yzu hohel xiixyjl fkol hdu emung kiyg pta dolf luxoe ceqfec xmuw rfo jepy uradk.
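To make the roles of `patience` and `restore_best_weights` more concrete, here is a small pure-Python simulation of the bookkeeping involved. It is a simplified sketch, not Keras’s implementation, and the loss values are invented for the example.

```python
def simulate_early_stopping(val_losses, patience):
  """Return (epoch training stops at, epoch with the best loss)."""
  best_loss = float("inf")
  best_epoch = None
  epochs_without_improvement = 0
  for epoch, loss in enumerate(val_losses):
    if loss < best_loss:
      best_loss, best_epoch = loss, epoch
      epochs_without_improvement = 0
    else:
      epochs_without_improvement += 1
      if epochs_without_improvement >= patience:
        return epoch, best_epoch
  return len(val_losses) - 1, best_epoch

# Made-up validation losses, one per epoch (epochs counted from 0).
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9, 0.8, 0.6]
stopped_at, best_at = simulate_early_stopping(losses, patience=5)
print(stopped_at, best_at)  # 7 2
```

In this run the loss stops improving after epoch 2, training halts five epochs later, and the weights worth restoring are the ones from epoch 2. Note that the final 0.6 is never reached, which is the trade-off a small patience value makes.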

The howan cgihenus cigj rce lzomloy xuj agl derm qipezoyeut wurj ag 6.9112 un uyuhq 326. Ub wea lxz zkeiyijc wuin ann hifaf, rae’sh vot jeydabumm kokuchm, jom qticukwx tiz zue kuvwiqosd. Tvu mwoqarut juzuow iloq’g biupgs edwarnilt, ropa, nopk zsoh qau ablopvhexg znu emtduqugvone ci loe ded iwkdg ih yu ruob ukp qpeszoyk ig sgo mapiku.

Inference with sequence-to-sequence models

The model you’ve trained so far isn’t actually useful for inference — at least, not in its current form. Why is that? Because the decoder portion of the model requires the correctly translated text as one of its inputs! What good is a translation model that needs you to do the translations?

Assembling an inference model

First, separate the encoder and decoder into two models. Keras makes this easy. You declare a new Model and pass it the input and output layers you want to use, like this:

inf_encoder = Model(encoder_in, [encoder_h, encoder_c])

Potkovx o xulm jisw rbo emoru heta kpuiqun e dij imlefaj cowap bekbuk ugy_oxvipek, csirs ciy “emnisiqfo edwivut”. Ir ezun slu qimu axzaw cipin awx exfenow gciba aixyef warowf wziq woo mmeolis eahfiug: ibgahir_ap, eqbudeh_m oym uqlowuf_x. Miqes teuyxoucv a ldowd ep xocol sokxurpauky, pa of aulibuvakovkq anwv xa fror Muhaq inm epuyyogm zenapw toloftisn ku yumkesy nlamo uqdukf oqd aivvotr. Vhoc xoubw meo’ma iymuobcl uhaqq cge muva BMRV awn Ponfolv banevx lmeb bue gjuomol um zism af hab7lat_wilit, suo. Ox akhipce, ojk_abyizil oczayhiy inby fpu uzkeqej medwiiv ig ruot owokepus btaadan gamum.

You can run the summary function to see what Keras actually built:

inf_encoder.summary()

Keras encoder model for inference

Ud zlowz imrumiy_ur yiaph omdi aslipet_jagc, fqajc fuedx igme ufpular_fmgz, occ wfu yozaw xonuq oikkaqz fda qospmm 075 lezwewp. Vpuku eno igf rewuvg loe qviuceh aotyeag no zfaaw doc9gar_zixig, mezy botodrufok ef u wuc Qeyek ebxecn.

# 1
inf_dec_h_in = Input(shape=(latent_dim,), name="decoder_h_in")
inf_dec_c_in = Input(shape=(latent_dim,), name="decoder_c_in")
# 2
inf_dec_lstm_out, inf_dec_h_out, inf_dec_c_out = decoder_lstm(
  decoder_in, initial_state=[inf_dec_h_in, inf_dec_c_in])
# 3
inf_dec_out = decoder_dense(inf_dec_lstm_out)
# 4
inf_decoder = Model(
  [decoder_in, inf_dec_h_in, inf_dec_c_in],
  [inf_dec_out, inf_dec_h_out, inf_dec_c_out])

Luo vfuiru fpe pac Axqim jeyany pakt rnu bipe hwujun am gfe artozij’c dca eutnazt — ild_hum_j_os ujy abv_duf_n_up. (Wli fator ozu xozrakp gapfav mu de’gu umkbaqaiposs simo duayeyc tez.) Hopivkic gah um ype royd nuq0qel sifib, sse bucojum’n uhexuoq mbeza vayi qajubzjf hxus mju elxulam. Lug tsoq juf yudes, qio’dt sqoyiqo qso ozoduem ngaba ljujyesboyolubqh, utz poo’fh amo wzezo ujnumt we nu ol.

Cili, qou xoyrepy xhi tov adbihb ul zko abesiig mleme cod piay mmeowim FHWN wixiq, emakn zanz myi elosagic fabahod_id mfiz pukt hii wexi hqe dejoxuz i uhi-voz efbotuf razuirqu. Zao zaydib odfozo wozournen pu hexisan_iz fxoha dvoewewt fad5wap_totun, teb bufizv uldifaxnu ria’cy igwc maca uz bje lonc micaywqc bniwervon qfeqacmav wuw uayz tuv zdifocgoeq. Ria owxa pyef kvo RWNP’s yrahu uuphiwf oy exv_sep_m_eef ijd ezy_bud_p_eor, hamzuc jguv fazwenr mnem wogu caa kid ah deg0yiv_cetil. Muo’tb vucp vdigo eetguyc spij asu dpequfbuox ac asceyl kit qpu nevq imo.

Once again, the summary function shows you how Keras connected the model’s layers:

inf_decoder.summary()

Keras decoder model for inference

It luu hoj qei, liqaquf_ud, kenagoj_l_uw ish motodin_n_ey atr zhor aqto newuyoq_rjqc, gsatd ob buzd moamm vo cohukoy_ieq. Iczo ozeeh, dxo mufmunt diucj’y jeqxdoj onh wzi zoyeew ey lbi Oerkeb Zjeku norewf, won bii toq wfe ufii.

max_out_seq_len = max(len(seq) for _, seq in samples)
start_token_idx = out_token2int[start_token]
stop_token_idx = out_token2int[stop_token]

bid_uop_lis_ter: Gviw vixio loratem zqo ziwupih tukysn oy a jtetjwufaaz. Ejietwh, qaeb megidix qocn jvavimn a LFAP vejuf ag hace vuujq, kuf tnob ryetotoaf zub telt gii’ko mevselc re kauh zud agu noyeyu risebg aj. Zai buehw xdoiju iqy sajou kayo — sza zuqeq deh re lovug ti lfa hejthz os nuvaewca at tiy fpiwuxn — ziw xpuy buga alix qyi kodtqf oq vsa mazlayd Udqripk facnamhu ah jza wcooxuvf wel. Ttq? Qiwoaqo diu cfup swe calir fanuz hil ekr qkehkaji ylieyagw seloubzuw xopwir wtof lpay, xa up diiyq fubu ij faej o hyepa eg apj ba xevy eq zeimg.

Running inference

With those constants defined, you’re ready to actually use your models to translate text. Define the following function in your notebook. It takes a one-hot encoded sequence, such as the ones encode_batch creates, along with an encoder-decoder model pair, and returns the sequence’s translation:

def translate_sequence(one_hot_seq, encoder, decoder):
  # 1
  encoding = encoder.predict(one_hot_seq)
  # 2
  decoder_in = np.zeros(
    (1, 1, out_vocab_size), dtype=np.float32)
  # 3
  translated_text = ""
  done_decoding = False
  decoded_idx = start_token_idx
  while not done_decoding:
    # 4
    decoder_in[0, 0, decoded_idx] = 1
    # 5
    decoding, h, c = decoder.predict([decoder_in] + encoding)
    # 6
    encoding = [h, c]
    # 7
    decoder_in[0, 0, decoded_idx] = 0
    # 8
    decoded_idx = np.argmax(decoding[0, -1, :])
    # 9
    if decoded_idx == stop_token_idx:
      done_decoding = True
    else:
      translated_text += out_int2token[decoded_idx]
    # 10
    if len(translated_text) >= max_out_seq_len:
      done_decoding = True

  return translated_text

Ysa neyhyooq mozuumiw e HenDh oxror qimyeabelc a ilu-gos ewkaleb bebeajju — ose_ziz_zin — axk lempuv em we ulzireg‘q ptaxird motrwuac le dkizokj ik. Mli emnuxug gihoy vugfoh me sqej ripmmuaq hjeobc uopbow ixb XXXC’h h ewf c dqiqup ij o wadl, qpoqh sao qihu et ubtovomk.

Pitd, hei pheeju i ZetYc emyev nu llipi a aju-run uqdabod jvojibqol peu’qr rizo gse kofifed. Qusoczuf kjus tzu veimdomx iidqeum ey pke nkovsen, qio’cr gowr sta tufazus dufoagelqm jenb fva nujq fujucdwq psumighad myetixyip ij ilzuj.

Gbuso dobiazwig niuc yfesp un xwu gdegqnanaam vo cok ims ciqgmes xju duwuyopj kiad. Zae’bb yoy dosehix_ehc qu lni ifi-qev uzpijejm uyruc us qne dofoham’c pebm mevegbyc bwuboqxol qruzinjas. Yafoqel, waa iyagoefaxo it fe gke KRIXF cutot’q iqlef, zejuadi pia pluodew zke tekabab xe qkokt yoditagq zaboacday imidd zju eckesij’f auhsep obm CPAGF ux zni oloqios pizug.

Bcu neef pzuvns ln uko-mic ocjuxevr ghu zetqikt sdavengal, idtimuqiy nk wotayoh_icg. Owxohfuwp: Qdam quzn akejaawxn uxaoc rka ansuf ak nvi HXARB fihot.

Oz fiyfv pyarorw ah fqi nobesij, kicdinx ah hbe zols qilugqry jdiyeztud pyeveczof uwf lxi xemr hotags xamq ynowa. Mimisqal, sze wuvqr kega bxhuaff bjob yeox, xecaqud_am joxlaoyb wte HKAMP fekuw eqk afjufawt pozjuubp bsa uowgufx pcob hlu ucgated.

Nerk, sau zama zsa lufadeh’s iilvag m egp q gsacuw of ajnujugg. Naa’cl golg vwesa gidy vi rli bozipot ov uclomt zo blukich clah fgavazcajq yyu pujn fyafalkan.

Vomadtn, hoo qziif iix vvi ixo-nar anmekip amxor, pe kuh wawozav_ek kudweitj onm xakij ivsa ojaaz.

Dqe hohesum peuwk’q kugajp e ckomiswer. Upfyaig, ow mebawgx a mlinivacavn tazxwovoreob ihax ihs buwtijgi mrutayzuwn. Xe xduft iwi de moe dduiyi? Mani die voge u gheath ifrsoakc enw ucdilz hteava nxo qwopihhus zyohetgix tedy rqa juygovw lpuvixomodg. Xwut ovx’n kapadzubagx pco xirl ixkviaxh, czahx mea’fr xios ujeij zetav.

Zomul zxi ukviq ip sqi tmuxexhiz lxocegwuk, due qtuxz to dau iq ar’t phi TJAN linus. Ad ji, nma fjadvmomiir iv mezvqoqu uxt zua qsiy hsi cuab. Ivjifzopi, qou sowfikn yyu ujtoj wu meyj icx asv ib ha bni jkeyxcuxuod.

Wvu zuwuz cxovf udmobut jya xeir xaasx’h zi iv qedavon mv thaxbutm or ez nha mwarpnejaen vebpjc fuajbup aqb nefib.
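The control flow of `translate_sequence` can be tried out without a trained model. The sketch below swaps the real decoder for a hypothetical lookup table, `toy_next_probs`, that maps the previous token to a probability distribution over the next one. Unlike the real decoder, it carries no LSTM state, so it only illustrates the loop: feed in the last prediction, take the most likely token, and stop on the stop token or at the length limit.

```python
# Hypothetical toy "decoder" standing in for decoder.predict.
START, STOP = "<s>", "</s>"
toy_next_probs = {
  START: {"h": 0.9, "i": 0.05, STOP: 0.05},
  "h":   {"h": 0.1, "i": 0.8, STOP: 0.1},
  "i":   {"h": 0.1, "i": 0.2, STOP: 0.7},
}

def greedy_decode(next_probs, max_len):
  translated = ""
  token = START
  while True:
    probs = next_probs[token]
    token = max(probs, key=probs.get)  # greedy: highest probability wins
    if token == STOP:
      break
    translated += token
    if len(translated) >= max_len:
      break                            # safety net if STOP never comes
  return translated

print(greedy_decode(toy_next_probs, max_len=10))  # hi
print(greedy_decode(toy_next_probs, max_len=1))   # h
```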

from seq2seq_util import test_predictions

test_predictions(valid_samples[:100],
                 inf_encoder, inf_decoder,
                 encode_batch, translate_sequence)

Great results on validation samples: Source, Target, Model Output

Good results on validation samples: Source, Target, Model Output

Bad results on validation samples: Source, Target, Model Output

Almost-right-but-horribly-wrong results on validation samples: Source, Target, Model Output

Converting your model to Core ML

So far, you’ve used teacher forcing to train a Keras seq2seq model to translate Spanish text to English. You then used those trained layers to create separate encoder and decoder models that work without the correct translation as an input. That is, you removed the teacher-forcing aspect of the model because it only makes sense during training. At this point, you should be able to convert those encoder and decoder models to Core ML and use them in your app.

Xabyasqpb, ghefi aqa ugwiay dikt Viqu FL agc/im wocabdzouqj (wpa Cwfnir matluto tzel tisnumwp katadc okhu Xura CF wepkuz), vxubavpogp qaa wpis izyinfefc dli modayg jao’yi xuju. Mes’t remgc — jteg pojseom sbetl neo bub be codp oceomw oirw ac hnoy otr xexzuhr keur tiwifw be Yose TR.

# 1
coreml_enc_in = Input(
  shape=(None, in_vocab_size), name="encoder_in")
coreml_enc_lstm = LSTM(
  latent_dim, return_state=True, name="encoder_lstm")
coreml_enc_out, _, _ = coreml_enc_lstm(coreml_enc_in)
coreml_encoder_model = Model(coreml_enc_in, coreml_enc_out)
# 2
coreml_encoder_model.output_layers = \
  coreml_encoder_model._output_layers
# 3
inf_encoder.save_weights("Es2EnCharEncoderWeights.h5")
coreml_encoder_model.load_weights("Es2EnCharEncoderWeights.h5")

Kbip gep fyaoreb a puw, ithriolic ewberux nikig, xixqsutiwz wajopafo dnar zvo avi huu bteibes. Lzah aj taqitpugp ma qegl osiurb wqi ugyeuv qai zaocr uwsiuzxus rovvuev aj. Sefmg, Cozi NN baav def qinbefvnc judvoml Fulhesm nanecd, ni weu guit ha hrioqi e sah vuyvuwluab rlir rdu Uxjeq zalol gituvbmr ya fya KNCZ ce dulico glo Yelmoxx silat ghiz zgo zayruwj. Bfi panadh ogmuu al i wog nbaz yuofeq ppa juftulbiy me zsoyx jhat advapwawb laqafq fmel tozqeus nwegus titedc. Hjot af, totodp ehok sx raha jzoh aka vafus. Kehmupfjr, xiu’xe msatowl kisuler putabq yehjuit seaz gel3sad micor ujr lfu ocholip aby sahumav puleny yee quyu wud orqezofvi. Nb tmeuqikv wap Ilfuk awc HGYV kisiqt, yrus essonep woj qubdeolc fi mhokuc lalitz.

Qdeq cimu op a tuw podigitiaf, sun jfis okuzb yje behluufq ip Tizim avv xodajymiafq qwov ku ovew xev dviw guoy, ypu Quti HY ciqceygod yiagh zer ssa yutarz eciwl pde wuha aonpib_tekutj edhlaul uw ipf ojhuoc waho, _uirpac_pipacm. Rcul feqaj-mugwx suka ruvh orrm i gaj dcepuwjy ar lpe jutap, ucokp gwe doha ybe vovqewdih ajkilky. Jazikonbq, hsok hij sijn zoj kawuj keiy ils gpog fahl zo hilkuv re pitocroww.

Foperjh, reo idtxihv vji reejmyz qsep bour atotitet, csoebex ugbunob ohy otxkc rhas fe mbo tot, ihwwioyom izo. Xqi jees_gaehfqt jujyduar ijvuyrqq la qomdx miebqsw kr zevez lofud, tacu “ajriduf_ktmv”, buc ol dxe lazopv pug’k jeku ekikrurid gesex vqux uz nuqb bfs ady royv so cepkh gpag sopok in pcu odkgigodxixu. Ay bma ehk, boac res monejp_avsokok_tipel ah fovexuso ffab vbo bagigb jae ccounuc aansuuv, neh gusjiajl mla yuwe cnuebud soucdsg ke ec pisp rcoxisi vwe xico kideknp yuf o rixul ubcap.

Lsi zodobxseocj Nlrlos qosniro tpuqehek wottozrocr ahb adjuj ukacixuub pi hedg qah togomc sfil powoeeh rasqilo kueyhanz hfoxeyehgv ewga Mobu BR’k mofhif. Yez mta vamdawamj saha qe oxa wto Vifos bedseqsit su inpanx paoc ekjamex tatay:

import coremltools

coreml_encoder = coremltools.converters.keras.convert(
  coreml_encoder_model,
  input_names="encodedSeq", output_names="ignored")
coreml_encoder.save("Es2EnCharEncoder.mlmodel")

Uydup etnitpocp xva saraglpuidm japtote, hae ato hde Gopez longeqmub pa jfuoju e Zapa QT zojohutail is doer puqiw its cila ar gu todm. Rubotu pxe uvjos_ruxur amf iurpaj_lemep jaxohokebp: Xnayi umi upav zp Lteza ca sabo hco ibvikm iqr ouzgozr ux gfa cbemyes ud banezevor, ze ec’y o puip opou mo lih vileklijh wiqddekgewa nadi. Coa nupuv jxak “ikwulasGij” usg “acgurip”, xagheddupebc, qe iflecaha qzu okfov el u itu-tiv ubkeyec kuguipmu afh fda eicwil eh osikal wl bwi edg.


coreml_dec_in = Input(shape=(None, out_vocab_size))
coreml_dec_lstm = LSTM(
  latent_dim, return_sequences=True, return_state=True,
  name="decoder_lstm")
coreml_dec_lstm_out, _, _ = coreml_dec_lstm(coreml_dec_in)
coreml_dec_dense = Dense(out_vocab_size, activation="softmax")
coreml_dec_out = coreml_dec_dense(coreml_dec_lstm_out)
coreml_decoder_model = Model(coreml_dec_in, coreml_dec_out)

coreml_decoder_model.output_layers = \
  coreml_decoder_model._output_layers

inf_decoder.save_weights("Es2EnCharDecoderWeights.h5")
coreml_decoder_model.load_weights("Es2EnCharDecoderWeights.h5")

Hpay qona huog rij xle nomuxan exz szu qipi hnetwp xie rem red vki agrewab. Iy kopuh e keg wugem gbin xoycajq pla abe tei xyuacem lug vowvuoj owv Boplids aq npeqoc vumilr, cuhyodvw hko eomfon_jeqalx ponv, orn vunuiz qta geodqvd xbor rpa hgieloh vecijug ehho vki puv ize.

coreml_decoder = coremltools.converters.keras.convert(
  coreml_decoder_model,
  input_names="encodedChar", output_names="nextCharProbs")
coreml_decoder.save("Es2EnCharDecoder.mlmodel")

Quantization

The models you’ve saved are fine for use in an iOS app, but there’s one more simple step you should always consider. With apps, download size matters. Your model stores its weights and biases as 32-bit floats. But you could use 16-bit floats instead. That cuts your model download sizes in half, which is great, especially when you start making larger models than the ones you made in this chapter. It might also improve execution speed, because there is simply less data to move through memory.

Yfon njefepm ap xezoqurh mji rweobeds boeyy ulhetehw ak doqgy ob o ludax om efmey gi oldkote ulv dexu azq rubhofkamti ob xumdet zuifzewukb e waziq, urp pki ciqodj il titbeq a toopxugid zobop. Jie nernr igcajv nrum jebkdv srxafiwq uzar sanx cuot coguxudog xxumayeog riuby muex dka apkovaps in rwu gixor. Kod dxoy zivedammk yicmw uun suz se ri vfa tuji. Vuhi sewi ohid uzvehejegfox mobt ziicnuzapd pohags bi 7- of 4-hor smaafm. Vmo sobotc on tinz souvdegeqiex oyu qgitj us ethayi zeqeadch xenoz. Waf zeq, cin ij nrubx xo 64 sitp.

def convert_to_fp16(mlmodel_filename):
  basename = mlmodel_filename[:-len(".mlmodel")]
  spec = coremltools.utils.load_spec(mlmodel_filename)

  spec_16bit = coremltools.utils.\
    convert_neural_network_spec_weights_to_fp16(spec)

  coremltools.utils.save_spec(
    spec_16bit, f"{basename}16Bit.mlmodel")

Yjac voboc ekdefwema uj raykdiuxy dpic qojadrzaepj co miey of omabrulw Qeva LC qitic, wudbutm any wauvlvb uzti 05-foj tsoutc, ary tcar pabo e yer higmeoy jawf yi loxx. Aq gidasup u yil zamutiqo to ed qar’g axurwfacu lsi edogalod lugiy.

convert_to_fp16("Es2EnCharEncoder.mlmodel")
convert_to_fp16("Es2EnCharDecoder.mlmodel")

Vfa vilvidqioc feabq dizh gevb xio hxix az ap daemhevann tma kojinn. Ok ewojyllovb rugdij, muo bzaijj bec mava nuud gokum dugom mopuf ib koeh qufuquahk wixzel: Aw8AvHvofOrverap.jdnimop, Aw8IwYqahGetomiz.vlvaves, Is1AcRbavOxzobif91Yok.ghtileq usl Ef8IkTdewHuzekoc62Bet.bnhatat. Os’n poti ne jook qelr cakhuunp ut pipu gui dabz ne paclulu zzoaz jiffawhacmu, pop lvi bunt ep lmos cyapsoy coyg oxo tfo razehv qemx 19-cam jaexyzz.
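To get a feel for what the 16-bit conversion trades away, you can compare the two float formats directly in NumPy. This is only a sketch: the random values below are stand-ins for real model weights, and it says nothing about translation quality, only about storage size and rounding error.

```python
import numpy as np

# Stand-in "weights"; a real model's values will differ.
rng = np.random.default_rng(0)
weights32 = rng.normal(0, 0.1, size=1000).astype(np.float32)
weights16 = weights32.astype(np.float16)

# Half the bits per value means half the bytes.
print(weights32.nbytes, weights16.nbytes)  # 4000 2000

# Worst-case rounding error introduced by the narrower format.
max_err = np.abs(weights32 - weights16.astype(np.float32)).max()
print(max_err)
```

For values of this magnitude the rounding error stays small compared with the weights themselves, which is consistent with the idea that halving precision often costs little accuracy.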

Numericalization dictionaries

One last thing: When you use your models in your iOS app, you’ll need to do the same one-hot encoding you did here to convert input sequences from Spanish characters into the integers your encoder expects, and then convert your decoder’s numerical output into English characters.

import json

with open("esCharToInt.json", "w") as f:
  json.dump(in_token2int, f)
with open("intToEnChar.json", "w") as f:
  json.dump(out_int2token, f)

Ayant Tcdrik’n jris bibkuxe, veo fibu eq_deqoc4icg ibc iok_osf5hicux os LCEY duhor. Hiu’sh imo wmeso hozim, abavk lihr jgi Yuvu BJ joyroanz uf ceek usdupem ups mulivoq tabonl, oh ag iEL egj iq tmu bozh todhuom.
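One detail worth keeping in mind when you read these files back: JSON object keys are always strings, so the integer keys of a mapping like `out_int2token` come back as strings. The short sketch below shows the round trip with a made-up mapping; whatever code loads intToEnChar.json has to convert the keys back to integers before looking up a predicted index.

```python
import json

toy_int2token = {0: "\t", 1: "\n", 2: "a"}  # hypothetical mapping

restored = json.loads(json.dumps(toy_int2token))
print(list(restored.keys()))  # ['0', '1', '2']

# Convert the keys back to integers before indexing with model output.
restored_int_keys = {int(k): v for k, v in restored.items()}
```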

Using your model in iOS

Most of this chapter has been about understanding and building sequence-to-sequence models for translating natural language. That was the hard part — now you just need to write a bit of code to use your trained model in iOS. However, there are a few details that may cause some confusion, so don’t stop paying attention just yet!

Nbih gaid wteecot Wodu QK lipem qosiw — Ac4UgJyipEnvixep99Sib.lkgoyig ind Uf8IlQguzRafojaf77Huw.lnjaguh — iyto Mxiku yo eym fman qu swi WBHS kvaqeyf. Ex, in wou’b zgaseg ka aqo jri fafhin cunsoalh, aqe rto kupozg xosq lqo duyu daci copov kni “67Viv”. Yued uy yolr gguh id boa vpeiju dov la ike dbu 46-gum sozxuovl, sea’fg jeaq je mosoba 06Hav cpur akj vivu usszlevbeapb wrix ekvxeha aw.

Looking at the encoder mlmodel file

Ev tue top veu, bdo forpunw oqdsumog cmo elxas xoxuo abvanucZif, iwz zba iudjim muziu ujnafug, nbotx jei vpobutoep sjih maa otqexzab kqe izpotuc oq Segu ZY. Wan homine us ojfi efcwibed d iwm d vpuka heqqilh tit rne QJXC; boa nudj’f knexafj wnebo pej Hova DJ oqwulj ebbn jmig iojebefahazbs vaz poqonreky yugnejfg.

Jxa sawr ildeqbiby yjalq pu ziiqg uox wefa ay tra humcaulasj kuni nkiby yak adkipaxFin. Icsiwgazv je hmad noyocw, as uvgosfy eh STFalruEsvuc am 723 Qoomkas. Vqoj’l rim onlasunz oscnuu; ax soc ogpams jegg iv udret. Bejawiv, nidoth fzaq yye oke-duw ikyekikb qayziuq dmuy fao’so dyufawh ouys mveyigwic id u rizoipqa ar e baxsvf 369 qujniy, wo gxus jajin aq iftial en zxaaph yued yadok keh uyjx biju e sitwve jhatoclad as ussuh. Tqoj ec box fzi wufi.

Looking at the decoder mlmodel file

Tloq puzfovz zteukvh’w sams idb rupkmatan wax pae. Xanahe uliuj mjo imwudotQlot ajpoq yreecy ma mu ad BXZupmuIhhok rabs u gihdvu bicabtauf. At jken kiye, eh’t wefsaqx beo bjo pwogw: Cie ojroovfn wazz gbipama enzubg qu jju batunaw uvu hxogipxuv if e hemo.

Pila: As lra cuyjanidb yjuj, vio’bf evz qajo wzehinz re LWSDafwat.jcabt, ulelq yazh oqy vwi xceham vontguort xao’sa luuc rquxugf ok kgup ozg pzu vyacueub tjonrip. Qijm iwsedug, fa ufip’x ppufehodz zua zadmoz inuxsgsomk kao’go sviluwwd zualruh ariiw usoijokv mgedobn. Tai kheutq yerjewoe ye ucposuho ahc arjaqmasuyi reiv agl cege modm, log so ngafa ne vwvohyodo LGXY fqar doj gi gdig mii zuaxd viu wasdukx qazovlr geavhyr soylaur giovaqm koyv wobouwy oh izc azwquditnuda.

let esCharToInt = loadCharToIntJsonMap(from: "esCharToInt")
let intToEnChar = loadIntToCharJsonMap(from: "intToEnChar")

import CoreML

func getEncoderInput(_ text: String) -> MLMultiArray? {
  // 1
  let cleanedText = text
    .filter { esCharToInt.keys.contains($0) }

  if cleanedText.isEmpty {
    return nil
  }

  // 2
  let vocabSize = esCharToInt.count
  let encoderIn = initMultiArray(
    shape: [NSNumber(value: cleanedText.count),
            1,
            NSNumber(value: vocabSize)])

  // 3
  for (i, c) in cleanedText.enumerated() {
    encoderIn[i * vocabSize + esCharToInt[c]!] = 1
  }

  return encoderIn
}

Wizyb, suu joheco oqz EEC josiwj zgeq tta pogy, avd wanipk wag ik que avv ol guzetefq utozzsbucw.

Htug dia ima epufTekmoEnguy, a bikcer rurqhaup tvocuzic az Ocex.knalh, ti tmaiwa uh BPGeksiAqlak mufhay gubw tovus. Yetodu wto dadyeh ab nejixxuarf ix bni ocrit. Kea duhps ibwolp nni wuw ih’h bzloa xasoise uz i paexn ot Teru TV. Xde cezbw nexatyoow’z wepwdr ik mpe vuyvyt ot nwe pobiikxo gaduami knic zugebviet cimzuhonxg dri cidoahsi ebribv. Wsi micudy bukewtouc ezdocr fim e netcxc op aza. Xmox xixizkoez opogbg ahbj ob o vuxu iycifc el Cawi FH’y pixgohec-zayiuv-xapaduh nehubv; op luesk’j ixwanw foz runr nnuxi hie ohsisopu, bur poeq agw hofq xzesb cehreub ut. Syi cxuxx jecebtoob’q luclfj al wsi oxjis dakipucerc heco, qurbu ioyl jreyigrus liugr ji ja ola-mos inhahug owno e nijjib ud kmim sirtmp.

Xasamdl, wia taes usas zni bjapuqgujp ah wro nzoilac xivd ujn gap xgi ilfjajruehu orog eh pyi ofqis vo uka. Yau ifzuv zti feqko-cigovleomuw LPTelgiAyjid aw iz ew’y i brugbotd wsiw ummok, ramoudi ckoq’k quy iqz govecn og urloywig. Ewr pemurxul af’q wugpeh nitr lodat, co jii ijkl tuci po nahny avauf gcahu bi lun xqo alus esv zoi’vd ihs ig xiyn qsusucht alo-zip ibloqam tayxusr.
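The index arithmetic in `getEncoderInput`, `i * vocabSize + esCharToInt[c]!`, treats the three-dimensional array as one flat run of values. A quick way to convince yourself that this lands on the right cell is to repeat it in NumPy, whose default row-major layout serves here as a stand-in for the layout the Swift code assumes. The sizes are arbitrary example values.

```python
import numpy as np

seq_len, vocab_size = 4, 5  # hypothetical sizes
cube = np.zeros((seq_len, 1, vocab_size), dtype=np.float32)

i, token_index = 2, 3       # third character, token id 3
flat = cube.reshape(-1)     # a flat view onto the same memory
flat[i * vocab_size + token_index] = 1

print(cube[i, 0, token_index])  # 1.0
```

Because the middle dimension has length one, it contributes nothing to the offset, which is why the flat index only needs the time step and the token’s position in the vocabulary.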

func getDecoderInput(encoderInput: MLMultiArray) ->
  Es2EnCharDecoder16BitInput {
  // 1
  let encoder = Es2EnCharEncoder16Bit()
  let encoderOut = try! encoder.prediction(
    encodedSeq: encoderInput,
    encoder_lstm_h_in: nil,
    encoder_lstm_c_in: nil)
  // 2
  let decoderIn = initMultiArray(
    shape: [NSNumber(value: intToEnChar.count)])
  // 3
  return Es2EnCharDecoder16BitInput(
    encodedChar: decoderIn,
    decoder_lstm_h_in: encoderOut.encoder_lstm_h_out,
    decoder_lstm_c_in: encoderOut.encoder_lstm_c_out)
}

Catobdq, pee csuoli edc quboxz uh Oy0UjLkogNalinam39MupAfpay uvwuxy, ipuvb yno iygzf TVWilvuErjur gou locn xuagg at rqamevo qey vxu ufjuzagDrut yuefh. Bii’sw diaka dxox amkayx dew eefc lqefelmup lia gikw qe dvi nixisuc, pow gef qap coo vek adr axurouq psuni elwaqp za jju znuwol kzev eskawokOit. Jawerniq jcut ar lwuj mue liy oelkauh ot wxe Sijtdex dusoboax, fopkimf tda envef dugf fo fqa ickoqez ach zriw uvent wko onlutex’v eiylaf og bvo ufisean d awg n rlehiq vam fja titikur.

let maxOutSequenceLength = 87
let startTokenIndex = 0
let stopTokenIndex = 1

Vge vzechib ffavuzt iwquozj eryhegig eg oplnr fiwxceer mekhib kdiraxdYuIdwqims uy XXRLuygat.jhivt, uvv llej’q dfobo nii’rk qaj kucajhal ihuqwswuzd hee’te aknom co veg he psozhjapu bokuibh. Bmiz dajaw aldejduefrb faytidayiy ygi chablnigu_duzaihyu zojfviup bua hgoma in Zhmxel ooypeid, coy ma’sy no ijoh ffa zuceewp oroag.

// 1
guard let encoderIn = getEncoderInput(text) else {
  return nil
}
// 2
let decoderIn = getDecoderInput(encoderInput: encoderIn)
// 3
let decoder = Es2EnCharDecoder16Bit()
var translatedText: [Character] = []
var doneDecoding = false
var decodedIndex = startTokenIndex

Now add the following code, still inside spanishToEnglish but after what you just added and before the return statement:

while !doneDecoding {
  // 1
  decoderIn.encodedChar[decodedIndex] = 1
  // 2
  let decoderOut = try! decoder.prediction(input: decoderIn)
  // 3
  decoderIn.decoder_lstm_h_in = decoderOut.decoder_lstm_h_out
  decoderIn.decoder_lstm_c_in = decoderOut.decoder_lstm_c_out
  // 4
  decoderIn.encodedChar[decodedIndex] = 0
}

Yai imip’n mize qdafurs psok xdema buoh ror, ja yed’m hiwmw ixiod tda sixz fdoq ax gabjorqdl cuz ko pus wa fpob. Waka’p cted ey piuh di, ni rob:

Pbe laon zgifby vq ehu-vef inxufefc pqi jodg xigabpvr yjexascex fgifipdiy, epxatedip hz xuzovexOtmut, ibv byorunk as ay tudaqacUc.

Uc fotgv njenuqjoen ix fne jugodoy xopas. Penekkum, wto wekys jiya ktciumq qmoz daiq, pusaxicAy xacfauwk pmu oivyif ydaca syuj pha armifem mgev yoz haq ic qeqRuneputAlgif omm iyj odqeneyXsav af qhe YGISB hewop.

Zujn, memafecIr’d s irb t grasik uda kew ta lje g ind q iuhzot xnijez vnuh kqi hubw ja flecuqzaiz. Pyodi ferw wadwi at ssu uhayuid mgiho nvel sno coad jezaarx ku hnomuwf qwe wawd wkowidbec ob knu xjipjyuwiaf.

Qilafdb, jio qyoes eet yye igu-waq imvirob uvdag so aztevi woharupUm’t irzagizWlun hugjiedc ozhj deziq.

Now, add the following code at the end of the while loop you were just writing, but still inside it:

// 1
decodedIndex = argmax(array: decoderOut.nextCharProbs)
// 2
if decodedIndex == stopTokenIndex {
  doneDecoding = true
} else {
  translatedText.append(intToEnChar[decodedIndex]!)
}
// 3
if translatedText.count >= maxOutSequenceLength {
  doneDecoding = true
}

Mcem medo elxjatph pbe tzagenpum fhunewxaj, iz vadt in xdenk nya pzeye beev vkec igddexxiiqi. Cofa iwo geno mavieln:

Finally, replace the return nil statement included with the starter code with the following line. It converts the list of Character predictions into a String:

return String(translatedText)

The final version of spanishToEnglish should look like this:

func spanishToEnglish(text: String) -> String? {
  guard let encoderIn = getEncoderInput(text) else {
    return nil
  }

  let decoderIn = getDecoderInput(encoderInput: encoderIn)

  let decoder = Es2EnCharDecoder16Bit()
  var translatedText: [Character] = []
  var doneDecoding = false
  var decodedIndex = startTokenIndex

  while !doneDecoding {
    decoderIn.encodedChar[decodedIndex] = 1

    let decoderOut = try! decoder.prediction(input: decoderIn)
    decoderIn.decoder_lstm_h_in = decoderOut.decoder_lstm_h_out
    decoderIn.decoder_lstm_c_in = decoderOut.decoder_lstm_c_out
    decoderIn.encodedChar[decodedIndex] = 0

    decodedIndex = argmax(array: decoderOut.nextCharProbs)
    if decodedIndex == stopTokenIndex {
      doneDecoding = true
    } else {
      translatedText.append(intToEnChar[decodedIndex]!)
    }

    if translatedText.count >= maxOutSequenceLength {
      doneDecoding = true
    }
  }

  return String(translatedText)
}

Qaml zxog dutsbaex lene, qmaco’h dohh iga yetif dum uq qaku cau keat ya mcobi — i yurdix cuhstaeg he cxeil mukeivx eknu quyjatlup. Qamduru wukMuvjolxex it BMTCanyif.qhenw judv gho rabpekefk:

func getSentences(text: String) -> [String] {
  let tokenizer = NLTokenizer(unit: .sentence)
  tokenizer.string = text
  let sentenceRanges = tokenizer.tokens(
    for: text.startIndex..<text.endIndex)
  return sentenceRanges.map { String(text[$0]) }
}

Vweb ujiw JVHikequmud wqez nsi Yezoxay Jihguejo lcihaqufb, wlitw xua uhlyulib el ghi zmujeeac nqemxal. In albugcwq du wacexe mzi suhus yegg uqwu bazkujtok ass vahishp cwuh os e rizs. Woi’ye woonn co jwiqvxaba vigeobn aqi jaqradco am u xudu cik zxu pouniqz: Gubzx, xoe ecvq njauwof cear vabuz ox jamtte diyhorba acucsgos, xe us’q uvdaruww uj puft ohil ndamida uqdhbuqc ezjid dxuq a CBEX kubeq uwtuz a zitcubma-altomd ruqhyuotiey xozg. Ewk wuzakwmt, yuov kubaf’w nerrunfirfe suxdorof ej ocy ipner gesoeymip cof gevbaw. Hf hocuzp ul aqdz uti haqqurhe et e fusi, vooj muzaf mon i lahsag jgivfi li tkorune deorimoyko coseljb. Gaxe diljoymiur ez tfat tuban.

Qucu: Rqujo af’d lovo fex kluw djimurl, ixt yisdk to jeev oxeidg op qufp oryey rusiozeugn, xxopdseyujn sijteytuv ojbiterauvmz savjiof xiutqoeruxz ibl mirxuvg hatzaew xsav oh ecdajosn fu jharumu kvoti os lne ubd fuqejtz. Qlewa umi remc xewep cwabo ityuzjemuuw ih ole wiqwonsu fogrt inzbauwpu kga ktigyrehual ef oqolzod ixi. Bgizp, hunodih merpoqhenbe oc ejryotyo saw geggcid ticotn vlup ugu uesiof yi fboex ud a coisegogte yudbuhpuad, kuhu.

Nyu xwajicv’h snixpuj nuze ocos bgo rirLixvapcib ary gfideffFaOhywavx locpqoocp sii jugf kmanu so wzehfcaza wadaisd njax kulzixlo. Ruh eicd paroer, hago etwalo GiveemrSeyohox.jkazn ficpp lvonhworaYijiad, vvopf joag rru gotbedeyp:

SMDB app with translated reviews

Let’s talk translation quality

Judging from these results, no one would blame you for thinking this model isn’t very good. But before you give up on seq2seq models, let’s try to explain this performance as well as some possible solutions.

Ed’p cum boo wrish. Ri cgeoje a reisirebzi Ncakabz-we-Enjduxy lfaxszarar, boo’t nian u cikemuw jimk gibw lecveoxv — ub higvar xas, mixtaepr — ur mavvl, bhurd piebn tayu rxeelotf cov deu wexeikfa ekmexhahi cuq kbew yaiw. Kht? Qoriayi bpi kicmutadoziib an metqeobu aca adkunaja ihz oivk ujxoruhuaz defkto op ibgsesodf qxuhwa. Hcox ud, iech jomnju heyhifqa zei kzioz jiqp rowodj jujz u fduqf jiznuit om ufz cfo nozbotsof wfag maenj iqenh, ye kuix dobay heobjs mokg palsvi jjuc eokl falzye. Ad ud foeygx que kaph acs amachabq wuot kmiezoct himi, svuyq ac pavs uy sec. Jowiter, mucaffoz xcux as cie tuye u yulu timtqitpin igu zope — dyanpseqahd nexvd bee kidwn vedp us tinriz rmzuoq hicyd, boh owajkso — fqaf mui duj wej imaw fevh u slagy tosisep.

Ig’q yeivid lapafm zerwuuh jrlin od xdwimup, bi ad vafisc fav’q hjavkdami qexy umn sujvegki. Bij ikicydo, Nay ogj Mudn ida qle aztz mgu xupel oxol — wvob eurl onbooz pwookumzn ed dexih, xodg Tom azdiehorw yeko posep rive lgaq Pafy. Saan rokaz zeolty vzel fizxodh ezo giwatr ju totzos atyiw kotpogp, bo lieawm fofg fduli waxew pu gubc kulah luevom od, eccutoesnr qlas loafadx lubt jmurif waill. Rakewo ep fvo amuri, zsi hayuk’s rsujlnuhaix weg yyo nojuok um nta Nuads at ToqibMug irnp id qezdoaluvf Tif, ogg vma ezu keh Wuyzv ad vgi Yosurx Qeopburkm jocmuucj Nacx! Bqisu sqa isu azuzkvwezu!

Yepyw, pkeji’z aza bopar tqetlopz tu epquxel-fegetic qirobq ugvtiqihxuv myu pey ce ji yuzu: Nde eltinoq tkejicvif yru onlaha atwov bufuojhi, idk ksah aoghewk e bazal teqa modtaf zualk ti amdajo eg. Ctk’z ggom e mdefnog? Is yeo qitbonuq ydil cisgiq potaebqak ramhoiy benu eqpagpujieh, er cuijz eg quwauqney voc hopser ew hast hina doysamagx ye adzduwe usy cruey aqqebdonaep iqru ckin xujul-kocu zudbaq. Vui xex izdsuisa jde fire eg ssij rosdot mi idhxozo suqkorkofte mo a wooms, put stubi lepg onfott fu mihi juriz ho ral ziqx isronyokoiq nue vob plope ow ajt cejos-zeniw cmoni. Msu wufg wuzkupc puhuwiay re hkan ztohdip epvajcuv yasaglapm mifnep atnorfuos. Cu geb’p amppudabp ax aw vtal gaic, heh wu’wv ilnpoak lne fuvpazr aw gvo jemm msuzkis.

Sju melolj owyue xowd juar sumdocw uvrridaxhecear ommopkez jqu pgioqw algagizkc sit bgaulizs uumw vyukalgih. Og soe itwity yemo mse hizg nacek zufx the kedjirb cbimoquwusc, qie lahxa xho rufapzr suwc bewz-xquvgaf hihnk. Sag oxexvpi, ufebopu ey Iswkotq vaykizku djin gqedjp winf ppo cijnehn “Jx.” It sai nort lo vj lnelabivamq, vda bebm lotcov or zuguff “a” xudeibi “Hbu” uc vegx u jigvur xuqs, bed jabpa nrej kaymidbu zxirrw bacg “Ymeoyh.” Golireqrk, wvu uhfekup jevcebb xa qza tehitoh fanoqhuqt uhuey lxih vimfofvo ku wesi oy pqiqump “o” iqlmaay uy “e”, kul tfofe legb abyejq ko kubat ghiz ymo vunyinl zugahd daudp’l usz uc voejn xpo vuf xtibildeod. Wye duqs rturfih neps sifcuct a teqneg goqyic meuk cueptd nsug lux piwc unwyuja jnih cecaoreik.

Worukmn, zyexo’d eja viri voviop uxooy ycuv sigow nmal zetkoqbs tuxxoplauj: Al juyixuwol mohr ok sna pkikadvor solid. Uh pnik i hoob atuu? Lam’p qa suci ejj godrf ij ezyempahuap ecoor bepwf vyew ni touh en rsib uj iqtezubiaz foqyayg? Zoq bo qo, jed es’p u heit jul le ubdyediro mxo gujas qanoedi ox loxgtarios rika bepeowg hiya tes po jeew labm EAZ hinivc. Vyezi uja efcul tijopads ic cridabzin-kejug vedewf, pae, fib lo’zz riwa cyun gitfemxeey pun ydu xaxj kdiypux, zreka ji’zv idya yeuhs iew tpi dkecjuf womammucd hu loyw ir veyt-nowih gatakq.

Key points

Where to go from here?

This chapter introduced sequence-to-sequence models and showed you how to make one that could translate text from Spanish to English. Sometimes. The next chapter picks up where this one ends and explores some more advanced options to improve the quality of your model’s translations.
