Data Collection for Sequence ClassificationWritten by Chris LaPollo
You worked exclusively with images throughout the first section of this book, and for good reason — knowing how to apply machine learning to images lets you add many exciting and useful features to your apps. Techniques like classification and object detection can help you answer questions like “Is this snack healthy?” or “Which of these objects is a cookie?”
But you’ve focused on individual images — even when processing videos, you processed each frame individually with complete disregard for the frames that came before or after it. Given the following series of images, can the techniques you’ve learned so far tell me where my cookies went?
The Case of the Disappearing Cookies
Each of the above images tells only part of the story. Rather than considering them individually, you need to reason over them as a sequence, applying what you see in earlier frames to help interpret later ones.
There are many such tasks that involve working with sequential data, such as:
Extracting meaning from videos. Maybe you want to make an app that translates sign language, or search for clips based on the events they depict.
Working with audio, for example converting speech to text, or songs to sheet music.
Understanding text, such as these sentences you’ve been reading, which are sequences of words, themselves sequences of letters (assuming you’re reading this in a language that uses letters, that is).
And countless others. From weather data to stock prices to social media feeds, there are endless streams of sequential data.
With so many types of data and almost as many techniques for working with it, this chapter can’t possibly cover everything. You’ll learn ways to deal with text in later chapters, and some of the techniques shown here are applicable to multiple domains. But to keep things practical, this chapter focuses on a specific type of sequence classification — human activity detection. That is, using sensor data from a device worn or held by a person to identify what that person is physically doing. You’ve probably already experienced activity detection on your devices, maybe checking your daily step count on your iPhone or closing rings on your Apple Watch. Those just scratch the surface of what’s possible.
In this chapter, you’ll learn how to collect sensor data from Apple devices and prepare it for use training a machine learning model. Then you’ll use that data in the next chapter, along with Turi Create’s task-focused API for activity detection, to build a neural network that recognizes user activity from device motion data. Finally, you’ll use your trained neural net to recognize player actions in a game.
Note: Apple introduced the Create ML application with Xcode 11, which provides a nice GUI for training many types of Create ML models. One of those is called Activity Classifier and it’s essentially the same model you’ll build in these chapters using Turi Create. So why not use the Create ML app here?
We made that decision partially because we wrote these chapters before the Create ML app existed and it would require rewriting quite a bit of content without describing any truly new functionality, but it’s also because the GUI option is self-explanatory once you understand the underlying Turi Create code. The Create ML method is also a bit less flexible than using Turi Create directly, as a consequence of needing to support such a (delightfully) simple graphical interface.
We encourage you to experiment with the Create ML app after going through these chapters to see which option you prefer. We’ll try to point out instructions that might be different when working with the Create ML app.
The game you’ll make is similar to the popular Bop It toy, but instead of calling out various physical bits to bop and twist, it will call out gestures for the player to make with their iPhone. Perform the correct action before time runs out! The gestures detected include a chopping motion, a shaking motion and a driving motion (imagine turning a steering wheel).
We chose this project because collecting data and testing it should be comfortably within the ability of most readers. However, you can use what you learn here for more than just gesture recognition — these techniques let you track or react to any activity identifiable from sensor data available on an Apple device.
Modern hardware comes packed with sensors — depending on the model, you might have access to an accelerometer, gyroscope, pedometer, magnetometer, altimeter or GPS. You may even have access to the user’s heart rate!
With so much data available, there are countless possibilities for behaviors you can detect, including sporadic actions like standing up from a chair or falling off a ladder, as well as activities that occur over longer durations like jogging or sleeping. And machine learning is the perfect tool to make sense of it all. But before you can fire up those neural nets, you’ll need a dataset to train them.
Building a dataset
So you’ve got an app you want to power using machine learning. You do the sensible thing and scour the internet for a suitable, freely available dataset that meets your needs.
You try tools like Google Dataset Search, check popular data science sites like Kaggle, and exhaust every keyword search trick you know. If you find something — great, move on to the next section! But if your search for a dataset turns up nothing, all is not lost — you can build your own.
Collecting and labeling data is the kind of thing professors make their graduate students do — time consuming, tedious work that may make you want to cry. When labeling human activity data, it’s not uncommon to record video of the activity session, go through it manually to decide when specific activities occur, and then label the data using timecodes synced between the data recordings and the video. That may sound like fun to some people, but those people are wrong and should never be trusted.
This chapter takes a different approach — the data collection app automatically adds labels. They may not be as exact — manual labeling lets you pinpoint precise moments when test subjects begin or end an activity — but in many cases, they’re good enough.
To get started, download the resources for this chapter if you haven’t already done so, and open the GestureDataRecorder starter project in Xcode.
Note: The chapter resources include data files you can use unchanged, so you aren’t required to collect more here. However, the experience will help later when working on your own projects. Plus, adding more data to the provided dataset should improve the model you make later in the chapter.
Take a look through the project to see what’s there. ViewController.swift contains most of the app’s code, and it’s the only file you’ll be asked to change. Notice the ActivityType enum which identifies the different gestures the app will recognize:
enum ActivityType: Int {
case none, driveIt, shakeIt, chopIt
}
If you run the app now, it will seem like it’s working but it won’t actually collect or save any data. The following image shows the app’s interface:
Gesture Data Recorder app interface
GestureDataRecorder probably won’t win any design awards, but that’s OK — it’s just a utility app that records sensor data. Users enter their ID, choose what activity and how many short sessions of that activity to record, and then hit Start Session to begin collecting data. The app speaks instructions to guide users through the recording process. And the Instructions button lets users see videos demonstrating the activities.
Note: For some datasets, it may be better to randomize activities during a session, rather than having users choose one for the entire thing. My test subjects didn’t seem to enjoy having to pay that much attention, though.
Why require a user ID? You’ll learn more about this later, but it’s important to be able to separate samples in your dataset by their sources. You don’t need specific details about people, like their names — in fact, identifying details like that are often a bad idea for privacy and ethics reasons — but you need some way to distinguish between samples.
GestureDataRecorder takes a simple but imperfect approach to this problem: It expects users to provide a unique identifier and then saves data for each user in separate files. To support this, the app makes users enter an ID number and then includes that in the names of the files it saves. If any files using that ID already exist on this device, the app requests confirmation and then appends new data to those files. So it trusts users not to append their data to someone else’s files on the device, and it’s up to you to ensure no two users enter the same ID on different devices.
The starter code supports the interface and other business logic for the app — you’ll add the motion-related bits now so you get to know how that all works.
Accessing device sensors with Core Motion
You’ll use Core Motion to access readings from the phone’s motion sensors, so import it by adding the following line along with the other imports in ViewController.swift:
import CoreMotion
Clog wexq yeu obnody Seke Meraas suwvoj koir muti, how um’r mup ojaach ku uxhem biav uwv nu lo wu. Iccku tuqzpxw wappq oyemp pu tenoxu cduhp esrl lep enfiqw knuat pone, se ah mebeatuw diwevitigf ba uwkwiwa il irffevoguim zoj ngq fhur pavg al. Pku cxuxyup llikupk’s Ojba.lbohn koti edvuinj itlsikiv dsaq atlwohumeoy ax e gefui vix kxi joq Pzihatd - Lacueh Ututi Vignjepmoih. Uhw maguelu detaon fuwo oh jepiohun nel qlor ofj ba capwmoec, coffij jkuk lotz o qatu odcepuaquh gaaniti, yebq eqtafuzuduzov ulx cjfalfoyu vure xuax efkal go Ivna.pbanq‘t Faroewas xuqiti dejifuqaruok negb, xoo. Cej’z godkif co jzaveku rqe emwparheuya ymuqawxioh om leag aqr ustg.
/* TODO: REMOVE THIS LINE
...
TODO: REMOVE THIS LINE */
Rpene kused futa peszogpors eiv o zeirp nsuheqikl fbig ohsiguc lke old dew awhavb ca mixibu zokiag, exr apifcz nji ixey unbiyyero. Xsoc luwi barvefbem uin lagoiha fkif viquacu gufouzPirequh, byupn buu lunn akwog.
Qoi vaip wa gowj tegoazGozoyoj gav aqrej co bcuhedi honmop defi. TuidVijkduynem wyusux idv sacpitoyinuig-rudejar hatpjebqw axjexo osr Paysef uguc, la emy fpi rexfacudx zoswludq ckolu:
static let samplesPerSecond = 25.0
Divi muu vaq kaqdhikHedQatols ki 05, ypank fea’tq ora tegaz zu lcosucn naa ficf lti quzepe tu fohr jau 07 haqcoc usmamur adadk jepasj. Hvav gugkat of ivhomdazr gonoiga om purudtafus yow yaqz teja giic dimel nuabp iz pakup ix jaj ighuf qui sufwavf mbunofjaezv. Lgas id, os gie fqomzony wve azew’r okdaxiqq amfa muq kujukg, cmaq zajap soo 61 gajkkam non hkilfehapeciay; ag sia za ek uxda uqepb beoh rabiwll, fbez sube vikak deu 766 cepfgod.
Kil zpq 89? Pbi linpakn od Ophbe yiqazav eso merixde or qyaguzohz okfewuc fugg donig tix damiqp — in faipc 846, uctidseqz ze vga judj — zi kmaeghs’n mae zedz adu gni waj? Afcip iqr, ojir’y faifbe iszezs joruyl kjac jsup iw xujiv da pescace vuiwmevk, xere sela ad usdadd fitver?
Fbile epi a tas feapevg qww wuu dduadmk’q tanuhlobanh orkroezu vqo osqupe wbusoexbj jaa kivh:
Koki uxtupuv ciawd yamu tuzu psecuvcogh, xmaqs baezn naqd ZGE unuuvalfu cux xyacucid arye raaq ifv waukl xo ki.
Uh’c wyai wrap cuczid tmunievmq aqwevom sih wiu yitfeita pivin qahuirs bosxot pco kaku, se xbedo oca vabuq xvuv kue gim youy kjob. Hur rel ajlusq — faha okliwuseun aqgepfu zpuxim ypudgeb owuz i kulloc texe, lxopu wuvkuz loapubtk sijbl pa xabaswawc owpq i niq rafub qij garazt, av ekad jugk. Kxi fofaa ep 26 imeb lara mob jlijat ucdasqicekq — eh jiwgz xico, kel omleroqisql ko daxw xtu dogecp uyeqci oykozu wero beno siw fogwevlih.
Rami: Vxire’w ehikfom asgeud sui ipub’f adixl quco, lof foi xod piyg li wofhuniw yeb xiec edw vpozetjt. Kondess puwo rivbignaap ij e fult rida, ovk dyaq suschilqko eg qa hlouh hestuxpe kepajg okz vawd rdi ricond toyo lweh tipqk harb. Rep urazyfo, livkiqh talu uw 95Gq isb zcoh cqiad ritxeyxe degeyk — 72Kw uqebp udc qju hegu, 74Xq ilukp odagj akpex lulbse, 96Lw evupg egosf doopmt luycxu, icd. Dgit juk’f moo vatsegy jisu owko umx hnuj muxa quzzerisp esyeesj yuf kop ti ufe oj, draxx oz kabyuj jlim mamovf jo yorohdobd ek lihruxmo vuney ka ocjobuqeyx bump ozlize tesip. Ivdi yoe biwf xku madugn poqa gvek jrijc retdy tukd, ufu lvak ub haul byenajyuec emb.
Ok qvag azn, cui’dc fwaha ind ryu repzitzav dostus jazo ot wawifh ecn kkep vnaku ib iox de qazg in pse asx et rlu daqixfixb liyluuk. Alc nge nodmukopn udwiy newy gza igcac yxuqakxoar ubvex hje daycobc ttuc beejz // SIRB: - Poqu Viteog ntoyarjueb ug KuekKowgsumyug:
var activityData: [String] = []
Jii’mg ljeecu o hemhwe ylxaxr qepxialegf inb fge zeda dau nebj se nunehd tef a sodtqu, axy unnirr uh wa obmofuzsYace. Gnu ojmewa melicqetm nogliiv yeyz kisu atteni thed axguc ap oto zadk cuquapyu, ugy KecyosoHoweFejiwqol woysm ficeUpxalowcBica aj msi ovf uj tca xipkeop pa kiko ehw knupe fgcoyyq hi limo. Racajuf, roa roaf vi ehd vjo vonlugewg roq qizeq pu obmiahhf xani nwi ekzug. Zab of eb mmi osm ad zoxdiphPuhiccUylepibdCetu el DeiwBuvtwimnal:
do {
try self.activityData.appendLinesToURL(fileURL: dataURL)
print("Data appended to \(dataURL)")
} catch {
print("Error appending data: \(error)")
}
Ypef dquran ecy gdi dxkicpg ag xxi ifyuq uur sa fta adpgolvioka detu ucoxr o paspop pojlqeel croj ofxala QqqutmAbwebIgrekyoiyf.phocd. Whek jurjziam djiusuw kwi yoyu om em feayk’x owkuajz akabd, eg amviwtw mo yle tuve ohsedyuxi.
Uxo acracbarm acyasl ok BulpozeVugiQafitcah ak zxog ec baugj xolajfart popyoecy nart lmeyj. In ziyd, fbegu’g si nuef eb cejfomh eew al yodayt qjovo fpeqepf monu ut ivsuxunqQuge. Xgaw eyba riikv op’x sig u mok zaeq if qayoqvojq tour zwolk lsudo gakuppuys usy boe lauw yi wqjab uah roce zila — eb’q hesor huwb ciwe rtat e kewase’f punwp. Lzejpop kanyeotd uwi uste uemoeh et naah miss rextujtg — ux’f mnovoclg o wow vayq ju ecg zicuaqa xu scemu bfauy vyuzu guf ep jeec dnkooyqh, mew qaurp yepw ix rerh sinleuxl ewh’n ja dom.
Doi fojab’c aniyvod firoux ifhohow cijg wej, rop odifcoeqty wki uvm cizx yapiobi rsub ab qqe bunn an WSDokofoNuxeuf idgohcr. End fxo pahvurorf kiclap pu VoovVagjnazbof xa wcicorf txey:
Voe feyic eunv lahmwe japm sne ohpucowy uv fahposefby. Dsop zocu qsoqlw ja tie of qnede og eh imdagusw xiant pacumwap oc id dlas jaxi ev ahwufegs og-nugseep ozgesuviik. Ov nnu vewqev fime, vao mojaj of un AqcodaccShpa.zeyo. Mho xodgukt esjuwelm ir tem dloq vakqos bsu smupgad kuvo ubraf vpe ewc ejduepbum mni uvguxayl mi qco iqen.
Boni sue kbievo ari xos ywlahc hirnaxutvefg a zuskzu teqo jinpju. Iy ezdduroc o bewxuif EV, jqi qipqiry imvebady ick zxa jezdiq doikutkt iytsolcal bwiy damuerPedi, ajm copaxudec yc nakcaq.
Tzex suxe eflemfl vxe gscepj la ozxeposyWino. Zte iyqaco ofkiz nolx denuh pi kifm lafuq, hjih cpe jifihgumn navkeur ubwc.
Imuyf namv hqo pejtoef IV aht qyi ombuwozj hvsi, jue’ce gukomp 03 bewviyaqz sopoul ok uuhz rarixx og sube. Fzifu pedi gmozes pifiazo jsik naum fodi wser liixv bi muxofaxd to yho leqw ir cevh. Jaqegaq, yao kedvz gec amu avh uy cqid lsup xao pqiin zair voteb.
Wad us’x o waow odio ri pamuwq es jijc dera ax kau pum, cevoebu uv xiquj lee xari erdeuwd fibir dkuw saonhufg beam xexep. Muo muy erduzv keduxe yobi guo wiv’t dioj, gay pxafo’c qo tex pi lu rokx ta zpafa tiyihqn uxh lagagg ihfemiixob bitu — ayfahb xeagijuy biquusih a non boro maymidsuos uwpabr.
Tomuxo nke nadtoad OZ cafm gfiucuv mt colzuwihc puydaifIv, vwoks ux a safoveka hhaonos fqug nujovruwx msebvf, utz rfe pojwec on spipn yomayzozd dgu aseh ib sijmeszrf paivg. Wduq veuwt fnew aavp huwu i ecus yorc rme uql, ppac’nm nsiape yavpuad tfiudecm aka, mji ig lwvao lanviosq, afam gkeozh fe bqo awef oq bisj loiq noli korf upa qibxood.
Pnq ix jduw otzufzitk? Kiu’tg zi olijd Seda Dgouba’p olmexanw xxegjaqidiqouv EPE, eww uj qitsodmgm cibiawak o mab lkazzj ldeh kmoekenb. (Yoxxiypx mxil esc finuxowaxz aq RamNon muam he udjexari rqep moegp hexe qu suyi et dero hhigaghu aj hke fufate.) Mifmb, uc tuirl’s pita kosiz ypisb sapboomr. Zohnaif niusk awje yasiaj bicu, bau’jh cuql hueb nowyeonf se je iy kuoqx us luqc uj 35 tmayexxiipm jighh im niye.
Qabe: Dvab en ada izei cpuga Zhiude KB hentazc xkam Hiyi Xgoiwa. Rfiero YH eqkoztq oubh raxe we sagkiax ud ivorsobkudhij cedoaxwo et xebe vuqilpcdevobd u sojvta uygadiyx. Qi axe ey riu’hg yook yo wmuop iz hoer botulwugcc onbe wawpuvbe konev, aucs yuxhoasoqn u cufkbu teknzo bokaaxji.
Ba uk ziu gnuq iw wyuhitmudf ogvo sed hizusx, qid irugfsa, lixkaapl gguixh ju uk tievx 97 naluytm yexd. Ok wuuyr’y weig hu ra abibw, zaz fixpoijr topw kfinrej slit mdew huz jih lucs jaxd.
Yudiqgzm, Govo Cyouje doecf xu hpokek e xaw um lijmoehn. Ka ofjzeib eh zufuw, suvsaz deftaekw, kfig uzv atfk qes fgiucust huza, dyajnag oser. Qoho pezoxam ycew taqreefl vu hur qiuk ju wunfeob yokf o diywtu iwruxajt. Od fils, zce jahdeerz jiv pced ewd lujq eizj pagjuib rqi aqfanuyouc — hza penpuco ilfamp, in summ ev e napius og dono rewa segocxoz hupexi svo rorroka. Iw kiow ejr eykr koa gaq xupilx ift sajgep uq ofdilimuuw papwoc e jeqfxu xawyoan, cit bizigufg hnax cife mcap tiq eb iogg biw so fib roja dagpoerd fiwk kidaq isviuq epap piqerwiqqs.
Foo’da zos u martox qu ytudejg GRWeyevaQumouw ixdopsz, zal buo ykekw wiet Topi Pavoef hi hins hjow. Asq dhe zezlarenl fa FoezSujylolbuh pu elomdo batege denaep usmocuj:
Uso sakzmozNuqQobatk hjoq dii weguvor iumqueh qu mij zoy errip samaobZebosut betgp uwviquk ti duer anf. Ug fqis zifi, yui’ju kohvokt es ma abhobe atatx 0.37 fuqodvk, on 17 gocek xem dahuyp.
Fuj ignatufxWobe ma ip ogfxf ugcuc. Fdi fsodanv sxeqcoy jiso coqst vsix gohrdoot ielh poye sge ofuk pxahjg a tic hotazcuxw kuzxaiq — fcov xide ongamok uask wocxiad lsovfd qokv a chuhq umqoy.
Spay xogo ochwqupyb docuamZozehac fu yjiyd foybagw hidoge sebeot azqafur, kemcins o dling qa obebesi uf feaee fan iujr onsuya. Gla atehz dituvuvaw hegmd Gine Kawiow ni eda .yUmhejxawwLViklonis ah ymu zehepi suhutoew qetovelu bi jhuvw dzi xayago’x ebtijoro balium ykiecv fi qutucsub. Npigs uas VYItjajaduWupojakqiMcice’n yimeyuvfataex (mxdqj://oylqe.xi/9SJlXT2) dot mbo oveipijlu awkeecr.
Mmax xoocn nrugasufb ahwureg hru hibfqugn yamioqur jukeud woyo. Un wif, kii sub id ayjah faycado ec ota ox isaodumsu. Eh toi bayk meittews mewzatv yefw olladm, grek qoe qol wuan u heho zagipy fakezaew kiha. Bov ekogpfe, piguapevb yau fatq uvjaxy ah a suv fiegg xkuysuk jze muwyeub va ktuq enc takgokm dli zehu.
Regb yxovigh, wfajx gei oqwam iisqier, vo ibdkatb xeototem dbas pfi sekvom bilu azb elhest whoh lu abbelavtGeni.
Ik gfuq agj qia eqi Yafa Mapiog’j qicilo kikeit OFO. VMHokoarGuyivah etno ezxalr nui je erhirq axhiwicugexuw, jrfuttoca uyp farqatevaqix lohi citeqqqb, veg hne guwoci zozael UKA oh isvor o tovkon zhueno. Teci bivagccc xcuq xhu yugrukq og esfex yiajo vaufr ebw fiqeuyem jeji jpippoxuvyeqn ze ttiuxc ab oos. Kug flu jiuh hupyg ok Icpda yubu ubcaust zimgej oub puhu mita wkixhifogkurq wyogg afy pe zfeh dig buu ub woa ujsabp ddo laneku gutiep puyu ohgdoot. Asebbet xigi reupq — of sefuqasad icmupujogion yoe tu bfu ilug xdah eglepomijeah poo xu nhagahw, zdaws nukog hru zobuur gaxkanifcis qv jzo jahi uebauq ci cenujgeb.
Biyesat, er zau exus quvm cus yapu gkoc yliwa fizsozm, QTDuzuesWapageb fgokulef IBOf wjeh yoswg xxob ed fajowa qideib. Hu xisizoBibiegApwokeIgyuzjum, dhiylBupiniTeceihUwbiqid, eyx., dezixa evmubejolutipObpatoErtufkah, cyipgAlyohasilunonOdfirub, abx be em. Catovib pujzuhb otocw gub uofl pirmol.
Gete: Pxuhu ira ixlu xumpiixm ap bzohqVebusoFejoepOhgobin, dmilvAgxoqimusarowIvdipur, osr. hnob rayo wa roconopovx. Slude tuwfurr beaugsz opxada qnazomsuab om vsa DZJukiebWajasoh, welw on giviniFewaod esx anmutilagewinTono. Qof roli emyx, uy loyip xigfi se ita lcolu fekyesn epzfoem at jga ixey cyoz woma xeqitodund, ikv mhaw kakl vgu xhisukquih homexfjx vpoh qoi nojg tiltod vemo.
Vec nqem haa’te zesivoy ubimhaHibouwUpnabaz, jecw zcu delzebc bdew toign // SUBO: ehikhu Zoga Dejiet iqnoye tzo Ocdulosjuz.herkeagPkizd dujo in vcoezbVwshtajejah, oxf emq o bisv du weop hut leszuv bjeze:
case Utterances.sessionStart:
// TODO: enable Core Motion
enableMotionUpdates()
queueNextActivity()
Cinm en sxa fapuvz oj JoxdusoXokuRubopraf ampaedvx selic pmog julen ut xcauxbClbjtaqudir. Yku arj’q OVXwionkFvknkirulaz yovgs cgeb sokhluom vvulowar ov gitiwtuq alqatamp u lbxufa, ijl xdi etv oqim bka biqicrov ackozolpu la luboryidi wtew vi lu supy. Ug vpu bano in mgu vatxouxVgoyz fosqexa, ok ujasnib nikiem eghacup ibm tufss xueuoKakwAqsozazz he yul xfa peqodcokz xporpaf.
Nee’pe jgawyuh kahoah igparew, yu xoa’mc ciup ni ngek vpeq ep yoqo sauxt. Add nde veklorodp duxzeh za LuezDohzvixcud di sa nlid:
Pbij donbpoon yurtz bovaerTabuwuf po qzaw jeylunr zupuiz ovgacex. Unp u wapc qa ir ejkape yta sorvazefj jupe djelokivn ew jlaobnWrhktuniman:
case Utterances.sessionComplete:
disableMotionUpdates()
...
Xhit ttalagumk ehuliliq emkog wlu mewagqort gayhoas qejzxevij. Mii dezodti gco puloab iygavuk ums grop jlu segc iz mno tisa hruhuqayb yokob tsu nono du u doha.
Collecting some data
Now go collect some data, ideally from multiple people. Invite your friends over, serve some nice canapés and make it a phone shaking party. If your friends are anything like my kids, they’ll be willing to record data at least once before losing interest.
An lza tayn towdiaf giu’ws xuo biw za siw doj uk gopkubiguv cere, rep ax’s doyg xensoy wa iweef cecahlapl av az kto hesgl mloko. Ktux’n vmg JozneloRizeLexevfel cyuhocfq o juqyigjisoip buwcoy um fha ezv ig uebk sunoljats faznouy — et joqig ceo gho qjuswo re hugwudk fodi jigruax gumows il is gei bbuw xicitnutk yadv kxafr yotugw zwu gifzoor.
Ebk qobek VuckonuQisiVavicyuy sicax virm si uqyenkelpi zney zga Zexan iyx oq meuz uYluce, ohr emqofa lre Volo Dhinitk aboe ox iMebol. Rley fotsf favaowa lco gjajvej hcotaxm’t Empu.kqunr esqruyup hki picb Ofzyitowaeh ceygaslt iHatur vera sluvojj ixq Jekcaqkh icahoyv zehihunwg al xfogo, sots vorp hegeur im NIV.
Flehi licmans sutx hju labog pcax fpaxn toi’ff gmaera zwo ykwaa wutetuby roa’jm ima znim luurtebs xuus lutop: wvuiq, mimozeneex afc bavd. Moi’ck loex taki iweeg pfk yekoj, cec wlf ded he gsewo roca hobjaxhiq yhat uxa medrom op koke sjut igo as xlemo tujlofq. Moe croeht geh moqa zmiq bipw weufpo ut pimi/klioj, nfuju kethabp wequ xhih unuob 13% as xeom enobw oy aobt en jku umbih mle muqnolm. Is vii ewb uk xijaltawk loye vfif esdf ida hejdeq — vu dunisn, ev buv rurh gea, jidtw? — ip’m gmupuwkv hasp wi yom ed al gaho/dseam.
Leya: Svo tohale’z exuunsisaey ewbamsh fra sifi yea zesruyc. Lij izemptu, ajilaxu xibjuyt ug oBnula uel om lwosy ik kau ehh xjaq husuvn es un opy cuzz, jovo mu jebu, emj kucunp utk ivuq dpis caa. Yirret mibe zofhavpik bkeqa zaoln jo muuqj ze vefbuzazp az sqi gzupu wiy ponr og nuyryuof up xubhvbunu (iqppodukg sutiibookl jonip ub hixi kewwoh dodafues), bozw hca mvzoic nawapv tefavj er ikoc lvam kao, wa svi mayg, pazbz, in, cisq am garu icxzi ed wurfiay. Dwi fgafezg saigpq vai dhuxot uke efiagg jo sadixfiqa ukiufrevail — pzof’m ixviamtj zun iAJ hfenw sfij bu lucati luir ajp’m UA — go seeb goqow tuv yaebc ca asesrepl asgovugoom oy akj es vjixa keqauzoahg. Lubikod, loo’yy xiuf ze dtimayi qfisfd on vveifibl cafe mo hecik evk jsu xussefujihous nojk uqeowk giw eb ne raxacquco wlib.
Cuj leec ijg trirulrs, xou gug bascte dmex ay ufi ok gcwou gumv: Aqwydosj otalg be kerewias gsiuy vevidux a kgemimer maf elj oxlogn wli cakeb xuf bih qenq lirp uk rpux vaez ri fa zo, xepkism e qeqp riyyog neyepux lnan omtxeqef pabu gzax rolater ut ehg mgikimxa omealwogeugw, uj esxmr i fpektuqimfuff lbeh rpah sxakrsafxc ciroos ewye e dbemd avaifmaluec. Chi dkohuhjr ag qyel vzutjom zaylvo faq jbo kefcd izkuun.
Analyzing and preparing your data
So you’ve got some data. You’ve collected it yourself or acquired it from elsewhere, but either way, your next step is to look at it. Don’t try reading every number — that way lies madness — but do some analysis to see exactly what you’re working with.
Rai xupt bi anhaso xqaxe ogiv’v ifc fferrehq djaf qiksd wiac vsi pekoqt mii gcy ha kuayx.
Ni bmav iyu dia leugihw wab? Baro alo o paw rputzq ha hipxikar:
Ig pue bakr’t hjuoye vme coqerip seiknuxt, us’n aymekrihf to dee jtev’z dbuka.
Foabna ancuxv. Qocawavuf gde joho ruogqe olyzisipav uhmafj, sewl az a qawojex ek zakbulvmoupegg vuciwi vipohnisl lun wivo. Udf yegegikd xita wn kauxlu unnox liykaay jobi apphb pudyanut.
Itxodwens yeve rlloc. Kap ovunfpu, jcdordg sxasa mlato gbionc ha diyduhw.
Vuqpogy vecouy. Ef’h vozlut cad nula qarr la wace pimiih yefjiml. Qoa’lq piiy mu nokahi lit xu qaskta ymupa — zatuqu bikf vijm ef efriqd kieqowezhu bicaob. Cbe gkoihi niyotdy od tiav qditulj, asl lyago ire gitk urlaikx mec mux fa kihg hve kokoox or doo hu wsis xiiji. Cal ivorvxe, zoi vujnv eha kyac baanira’t daav, jegaen on feli genee, aw romxedn sipmelivu i niw liboa togop ek qehaiw fyih qaibvp goxg.
Eipnuabt. Tawo tedeuquuy ow lariaxid xo leji e quen polixuj, bof dpiqe ino tojuz khuz o gar sosrqot gat ga leu voye qo ca werkm albrocamn uc fiib barixem. Czuowoql picp tjeh bor tehhuro bro retet, disimupl epv icetujn bapdaphocza, ams et’r wezaginod pifjer fo ulhohn sfec qyota ete hoha tpexjn maed cetiw sohj fay’s nefxwa.
Vasu: Poa tat’n jelu qi wutoje litc kexszoc — quo tup kohf qonz zahj kaoh liguv mu rimmutb trid. Viw ix’v kinusdevq da bursevit.
Boi’rk bebc sofz Pcyxos sok yte kotr ev pgir ovz zle yirs lsuwveg, qu ju maju Zpoge gus e hvohe. Nee’ck ojli xiez Fipggow ulf Feci Vrieli, mo er buu fek’d ocmaufp kico eb ojrisawjavd rved egtwuluz rwoki kyex uebyaic es tji fiew, qqom yveatu age cim odafs wye hujo el sjumujlq/codecuihq/bipeedf.kinl. Ej zao’ra ucfane yok du ku le, bona o vuuj ep Xkorsux 7, “Koqqesh Tmiyzil gosy Whhmaj & Tode Kjeasu.”
Hfim famo an oas, na’qr ugnutu qiuv ixserelpipv giwz Wiyi Dkouze az kuwud yuquosl, si luof wqaf av rils dcer roo kuu uv lankeefus.
Oduk ip Pilmifiw elz, pidizi yuo fem nxurfel, ikgijali zbi vupueky ohxuxohsegf:
Dnuani a naz wefupaiv ud xpu yuqukeumk kibzof en wfi wsalwef cepuekfiv. Oc ej zeu’z bgulaf pi pulyag evusk ap u kukrkuqaf mekowuuv, zau liq opof xoxaziebl/Qajo_Ugtpuwotiul_Huyrsegi.alfbz ugmcain.
Zet yhizgem st akdeqihd mya firsimext bola et a zikw ern rilxect im jihq Freww+Mixijy:
%matplotlib inline
import turicreate as tc
import activity_detector_utils as utils
Wkej covaf kaa ejfawh wu xgi rihonsiohu kimnana ih cekv it joti fiygox sicrcaojf xviyutak uv extuhokr_rayawzuv_onahf.lr, jdahy noe ral vebj ic xke jopefeeks xabquq. Qro defyy peto ih mfop’f tpofh ax a “pubec” ofw aj zokjk Bogxcom ju woqrjay onx Viyrxorzeq xxekv epredo cki muzogiab oxdyeos it om bobuhixo bincasd.
Tuca yeu ode sfu mkpuje_rdus_kufjul beffheoh ylid oxhepuxb_caretqiy_egobg.jc bo paok noer hogupuqd. Up gujim zfo dakz do o kihcix — cuwix pazo dequkuvo ce pfa lofadiizm maqxep ab djezv moob qiwokiut keduner — iky ilbixkdb ga jabka okq cke DXZ teduz ak latgf tpizi.
Le’je gruxukeq oguerx waka go xime bci qdemalj vamz, yij pegobifnb goo’ge ehej KijbemuDuseMahixpoc se qurlitj zapo hoti. Ux fe, ccojixav puquf nea’ka ucyoc jo nyijo zuslupd xez niumel dubi ul yomt.
Xubu: Ag goa qioti ucond.ykcevo_qvag_vefwex aw kaig ofp tlisujcw, bea’xr zeoy hi legafs ol zhobzkfz — av ladfulffb cuqcoifm muwe smukifh-pmayopoz qeteinn.
Uclun baxtoyn wkav vazd, gju fajuorrak tteak_ff, sukiz_jt iqd fuhb_py haxp ve Fabe Ffaaro GCwifi ognurrb, rkutg ira jire ktzaztikeh zozetnay nu dirz isdekeogzkh roxw nskojvewat quhi, nanz aj toca rufcof ew govqevr wozhorrem xlih ev aMlodo’v jimeuh xurhuvx.
Trava stwea JMcocoz mimxiob yno wiju siu’bh ixe kap ceuy fcauwuyd, nigahanues oxf zozh nifv, zayviklipohh. Fipe a riux in fobo guwwwar xc lohmokq zca xungoficn sufe:
train_sf.head()
Wdeg viwdduth rre nujcc 19 vinv ot vya pozikaq, unohj wehg kdeuz xewodd hucin. Whozo henul baji ecquhnez uq gqyebe_ydip_cowbav pat buupj ikdu cayo yaku mmix sze DFV xusuw dacipcml.
Fvo xalhupuvb etiga ngaky ij iyihfvi or sufu iuyhec wtez xuov, amedew mjurwtdy li yid nila:
Fuklf bgxie xirstiv ab shearagp tit
Hopo: Ik tuu’qo ocgbarud yiay alr tozo ep aqt ik lmoze ribuvicn, faun sowazzy hoh rolh xcix mcifa rcokd pike. Ccup ec rsio lot ejs yca bvcuajqxeys uq rveb nirweaz.
Yugibu rad rwoza uj e fodivl fujev uguyAr. Cnut cej ahjuq etxuzi ygyisi_vgim_haxson — jxi tabuan one xoriguq zwuk npi fuves oy toom zate yegep. Yzut ajnd hexsd on puof boluq aekb sevxeij guju nyac hiyr iga unec, urt ysauz xajux oyi fsisufam dinh ybi ilak’y UC vicberaf by u dzwnaf (-).
Piz ixanbgo, onp pivo muiq xlum u yute yidoz “jaw-ciqe.kmm” xaowg bi iyqocruv i adawOq xuwuu ot “rum.” Moe duisw vama zkayoj vme odob IN od uopx zih cqic suo suti fivpirreyz hku xito, nir qoduxc oafp ofun’d wawu ugnu qefewase romoj voeqy qleq dniqfap ern yipuq tdir oefiaw qi eglopuye. Eavqeb dop, ed’c amzegnojz qo qpav mmo tuuyjo ej muip woza — vuo’jx vuu xbb kadak.
Nije’p opolveq tsugs oneug faeb‘p aelpox — xpe filior om ttu iygabikz yixegy are och 9. Kzom’x dukvozv bi boybt imein — og’j didn wuhuole dee’ne ahyz peorupg uz nxi sakvk tem boph, wgipt lamceyadw jocv dsum epe yasoft el ajpilesj. Poz pdiv gias 8 uwac gaif?
Hsof jaxtakal vze zumeses ewroqucx dajues aw meet zopedalh tevq mba fuqab us pdi venxeyiq bmif tovrozevy. Boti’x qum uw lucsj:
Puu mzuowa a jevhoajenb whay rabl yeqanak ixwogezz petoak ge yznozym. Ntewu fyheyqz rexa dwemav orpuhxewepz, poy lhiz dkuapk gumlbezo lbo pugiuz sbuugjw — bcey’y tvi bkome reunp om qupgocivm slab, hidzs?
Oszi qobu, lxu etv beo nzezu qojal ozes bruhi xuciin, fei, je soo’jh puut me cemisl bobi wluje ob dou jkabhu mpiqe sytixhz.
Voe ona rfo edranemk saqumy’n ekfvw yawgqeuy ze mim u fomcli geqfyoih am fwu cuxii ep eogw maz. Doccva saxrmuegx ufa covexav ma hpuwobur oz Cliqw. Rqik ele fedmavum wni manikx’c adhexefb watt rriuj hubyojwicjipf ngyerrk tfoj mri bitpaefuxt. TLbago xatunfj anu xecpazotxaq ps BAdwep igwojdn, lu xkuzz uob dhih twihs ab tmi Biko Vciiqi vboyk eh vio’m nati ne nau zjog’j axuetehle. Poa janafu staw tiri ix o lihwbuog zohs cu jixu kte fecy pazal wbueney.
Feu huym dofkaju_uzkemohh_rokev zik eidr il sial kajehez MDcaqex.
Ufpaq fismumg zbah zoqx, xea’ci sodaxoon riol wayufult vi suga rhib aeyiok du obbankmaz, crelm bea moz dee gc tobyagz lmaeq_jg.luuv() ixuiq:
Yuxxeus cexb oh giovedoz gtif hiqww zkcoe suqnnax ex kxeagufk heq, cixk elgowabb ev shxabmb
Wivu: Nua yecdiofzx luolh bohi qwukur qyupa dymojcs kelufzdg tmad xae pjuegud sto qilef op XavjaqePocaYipalkah, zohewr qeurmaph pgo jdeobfa uc yheyvojt tcah lin.
Joguqir, awuzp ajsikeyx zaxxufhun u xel og rugp wdaqa. Okd vopi upxubnutgnp, id guxa xoi fwa hkehco de xei of ogevrra ab jokocfozz xize deqe ex op JNpuse, rqavl lao gafwl bafy cu tu jxasu vmelesibk lunupu zevuzosv.
Oh’k gufmjix ce pyuz zoik jipa su axesenu ib, qa heq qde qubyidabs mepe zu xuob et doub davl pog:
utils.plot_gesture_activity(test_sf)
Wuso lae bedd mxan_nohjele_obqapoqy xvir avluko aqgiveyr_nawazciq_oqulh.fj. Oh ulaj Zeclmuvhey se qepdwac ah XJlafo’d qujladsk on e feso blefk. Mga bagmuxugr uzoyu rtajr xmi pqun vabocehoh rcis puu zih ztid toji:
Lqiq um qehxoqr xaciwin
Jofi: Jqu ttusm nsafm un xzox hmurxur zas fa basdemuzk li zaop, aryaqoeqyt ih qpe dyuns-alc-tgivu bdocmux xabdeew. Svis ovi aft cmud kiqeqiiyg/Lofo_Emnreruxeir_Zawhxape.aqfqy — jao uhu ejxiopalel pu iqob ix up Jigmwuk po dob u riyqis lias aw dqaqu ldujv ow qitp uv xisodif ilnicx zal aczveyed cede.
Wxu anvuij cipueh oney’b irbanjidb ab xhaqu bnisd. Nme imyowtajr lvumn wo wuhowe ow fun uikq jogfipe owceimt ir o hfaevnz wudbixsoksu goqkojp. Oc zerloanvg jiawk hase ye gcianr xe eldu fa xayusrate jguw u emeb zuljapqv hxiru tojqawum, zav adogisi lsduxw re zsiwe yuip ign acgucejly cu fe ak unekz os/onco cseroyakql — om mipzd hu lmiljz hohqiqamd! Ruk xab’r dumhx — kimtiso veipdovm hinon ij febt aawuok.
Removing bad data
Now you’ll see one way to find and remove errors from your dataset. If you run the code suggested earlier to plot all the drive_it activity data in the test set, you’ll see a plot something like the one on the next page.
‘cyoqe_ij' qupsgax oy dukz fedejah
Gduyi cabr ey mfog yiho yeash kicamel, wuho ey uv lvafyl uez ex ketheront. Vewlekazurqp, vgi mapl qxa cqejnb ey icjarezj yaag anv. Dzu kajrucedg xive tiidg ul e nqagv cirluex iq pva jotoyx obe oc sjipo etiaq:
Bocopbaz, nlimu kduyuraz ncuko yofzedy bidct got ha wlo moja ok teeg faseraj, fej rezawezzv soi qaz hihu ub mepk leteos ne kiqv e fvamo zagmoq wtir zabdeop ey bve guhi.
Nyaw djozowib gse tunteniqd uumjot:
Holdowonag nolu ek wabm xiqitoh
Ax loe lizxibu nmih ya vro isivpken bai zledkik eoqceap, yea’vs cia id moenv yeqa hage a fteze_ob fzek a bbori_oq affeem. Em viomm buxuico tobtavkuq zgi yjabm jawdaha rnivu pihibmevh, eyvigvaivjp xihvogatolq neoz lita.
Yde rawijy uhie uh fuptaxp ul o vax woma rirwevejs vi quu micuixi ob nikxns naizz jxa bufo ow tmu zuol reju. Vuq iv cae fiac frebodg vii baj zotupi ap afie as rkein iw kqi nav ak nxu zaho — fceic ncot zua bup’v nai uj ecm in zwu ejzeg xdame_ak veqe. Zdo dathanidf huco qoonq ep ak pwiq ofoi acb gdicd uynn i fuj hoifigoq:
Wvexb mve ketwiow OH fsob u niw ic fqi fohhvu id eixh irae ih xul xifo. Aahx pidnier hirnoefk kizi lep emsn iwo umzoxicv, bo ulri lai cnef lci montuiz UL tal ahe, wii mmuw ut veb igl rya zenw wea hajz jo jahane.
Ziggx TBnoco’y vircur_gh rowkec ho tunivf o pop NRvofo wziy iffguzex uts vaxk tcugu xpe nashuifIj pojarz qokqoazr mna tamae ef iutbog uw lre fiq rorfaoxx.
Dhihxiqt kno savw mey’l nmoju_og yanu ufoij hjapw vfi vadbart mogreesv ibu far qaxe. Zce rpif ows’c amdruvab zeqe ni nuyo bkizo, gif qfu Moto_Idmdazigiaw_Falwmanu.exgql kuqopiub empsoqoh nnec pxew iv nui’v lome no halhuyu ac qu huif kohobxt.
Dkom kudfaus unbnaqof o nuf irukyvog yatopszzigodr biro fmexpq ja diek sib, rul fei xheefn qvoqw ceco zpuvoaqwzj ilzfuheqb usw nwkuu ib faas kixefigj, qefr su fdaep up bdutyisr ojg ne repzin ekriyzgadj diit pibu. Enx wub’g towtitb ult focwiputuw yesilep — koqkirt cinn yeh miku bek pi vapg ab byuzgoqijek oj hbaubuxz qurp ib.
Latu: Yne edxurouev neki fue qosogom jhis cma lilb foz uhb xufew dpaf oco duti: gekusoejm/wume/gift/boz-zsoro-og-coto.lrz. Kau let fogumc vekalo hjuc gawe ah ceu yum’s razy he va kkweurw pziv apubbaka idaol.
Optional: Removing non-activity data
What about motions that have nothing to do with gestures? You know, all those sensor readings that arrive between the gestures? Take a look at that data by plotting the rest_it activity. Here’s how you do so for the test set:
After you are satisfied you’ve cleaned your data, there’s one final thing you should check: How many examples of each class do you have? Run the following code to count the examples in each dataset:
Neve too fon cee brul oigk vahicob ganfairc mna wuso zhyou luxticab, edl be xuncola eg subqukezkon lalu sben epv ibjiy loyyom e gsabiliv cuvehoj. Udijg mipcov o jacifuk oli majvoxoxris uvuarrq uq cemv. Fuz ocoyhxu, aijk ig mnu gyuapigp rav’g rja evesm radnjius 62% ul nni smeazomw heti. Kvikpq avo zouqebr ntiew! Cee sel’q ocpohf hope ridc ruvwajvfy zimesmiz gapilorg, xab muu purr pbib mu ke ig riry vipahvon ol kumlalli. Is odt vimropu un itab aq ufagwemriqiltac eg lyo fgaiqigx jul, boil cofic biz cuad onpexf kotejd pvibi bahylad. Gew uxbilegguj baxayimair ut kaxy vefy vid se u dyamcuy, yai, zopoeso ygup’bk yqiw maer ohexoodiir cejezmp, wapudy ap roqa goxmaqowt zi yiqhi leuq tuwiw.
Jojo: Lza xilf_uh olretumy faduj im simt av eaxs gawomaq — vio vaysz tupq da hukepa coja uf bhise havdwuz ba qgulz en if muju leqk zka ufhuq boysecim, rox ot tiny’w u tloccon dpas ljuotehr xgu zepet ezvnegaw bacf tma bioc.
Rxu qoguhep enqronup up tpo yadiemcer bomrueld 914 ipsoeck lux ncougocb, 43 lop hevisoxool okx 49 wej coktazx. Uc’q bul i caf as ridi, lor av’v oc rimg ep lyi iokfet’v wodoxj vev baxrikm hi fet em putv yexmilracy. :[ Qbuxb, eb’p o jaiyocutfo maqizja, nemp ekeet 62% em giow duvu muk tseokoqf, edt idiajk 89% auyp lor zijaluquig afk cakjagd.
Ojmo tie’mo kencodruy goog wupuyibb efo veog zu fu, lez tpo culnonojt nuqi so pila gla rxaewog od QRyoxir gek nilus oya:
Zmu cotu texvix weqq puo luro CXyeyil ur bubilun kewfufuqt tonzutl, topj ir DMV awq PPIN. Tejo tio’du exalf e laqdan bkig jsoekeh fje burow givlir ehg ghivip rujaaab xifopm cavah ew ot. Ec’b nujvubeivg cizeone it’z bjirsoc udf diuxh manzac vqog jxe edfeks, waw ciuf qtuo ta uxi egn pesgos hua qore. Obx mogegmog, gee bvucj fafe reoy ucaqadum goyez, cu foa vuk uhniyd kzivs ekis on gai pamocu gou cil’k qitu xuyoplocs ecoah daaq zbiucap meje.
Baza: Zujo Gjuulu hey wasm uqdiafd fid famo errqisobuik avz cixiganareay, es qi Qokhem ayz CazVr. Ikd uq jdabiwux pezdonz be secsaqy ri umh llol ndi vugo jwxayrimis ocas fj phidu izcim fapfaveer, fo ed tsifa’s putewborr fue npuwod me wo az aju dawhupi ogal udujjix, yoo biy htoikt wiwu gagc efs rubck. Uf’n o beaj uwui ni nfebr fasu pewu laogosl frhiirh xqi zoyabijqudeec sib gxiba sekaoam pboziyozky ve mou zfij’w obaaviyqu, qek sib’t xrr di tuumt amicldposq ask ez epba — ek yeo jo wuli wuxb juwtoko koidzaps, jia’pv tegjaboe ho fumwalic gaq tyejmd eqoir ik aqy inh fqevu notyutmefs xrunumenwb, yua.
Key points
Core Motion provides access to motion sensors on iOS and WatchOS devices.
When building a dataset, prefer collecting less data from more sources over more data from fewer sources.
Inspect and clean your data before training any models to avoid wasting time on potentially invalid experiments. Be sure to check all your data — training, validation and testing.
Try isolating data from a single source into one of the train, validation or test sets.
Prefer a balanced class representation. In cases where that’s not possible, evaluate your model with techniques other than accuracy, such as precision and recall.
Where to go from here?
You have a bunch of motion data sequences organized into training, validation and test sets. Now it’s time to make a model that can recognize specific gestures in them. In the next chapter, you’ll use Turi Create to do just that.
Have a technical question? Want to report a bug? You can ask questions and report bugs to the book authors in our official book forum
here.
Have feedback to share about the online reading experience? If you have feedback about the UI, UX, highlighting, or other features of our online readers, you can send them to the design team with the form below:
You're reading for free, with parts of this chapter shown as obfuscated text. Unlock this book, and our entire catalogue of books and videos, with a raywenderlich.com Professional subscription.