Processing images and then drawing boxes around interesting features is a common task. For example, funny-face filters that draw googly eyes need to know where the eyes are. The general workflow is to get the bounding boxes from the observations and then use those to draw an overlay on the image or draw on the image itself.
When an observation returns a bounding box or point, it's usually in a format Apple calls "normalized": all the values are between 0 and 1. This is so that regardless of your image's display size, you'll be able to locate and size the bounding box correctly. One way to think about it is as a percentage: If the bounding box's origin is at (0.5, 0.5), it's 50 percent of the way across the face of the image. So regardless of the size you display the image at, the bounding box is drawn halfway across on both the x- and y-axes, which puts its origin point at the center of the image. A point of (0, 0) is at the origin point of the image, and (1.0, 1.0) is in the corner opposite the origin. To save every developer who works with the Vision Framework the tedium of writing code to convert these normalized values into values they can draw with, Apple provides functions to convert between normalized values and the proper pixel values for an image.
Functions for Converting From Normalized Space to Pixel Space
VNImageRectForNormalizedRect
Converts a normalized bounding box (CGRect with values between 0.0 and 1.0) into a CGRect in the pixel coordinate space of a specific image. Use this when you need to draw a bounding box on the image.
VNImagePointForNormalizedPoint
Converts a normalized CGPoint (with values between 0.0 and 1.0) into a CGPoint in the pixel coordinate space of a specific image. This is useful for translating facial landmark points or other keypoints onto the image.
VNImageSizeForNormalizedSize
Converts a normalized CGSize (with values between 0.0 and 1.0) into a CGSize in the pixel coordinate space of a specific image. This can be used when scaling elements relative to the image size.
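For example, here's a short sketch that uses all three functions to convert an observation's values into pixel space. The function name, parameters and example values are just for illustration:

import Vision

// A sketch: convert normalized Vision values into pixel space for an
// image of the given size.
func pixelValues(for observation: VNFaceObservation,
                 imageWidth: Int, imageHeight: Int) {
  // boundingBox is normalized (0...1) with a bottom-left origin.
  let pixelRect = VNImageRectForNormalizedRect(
    observation.boundingBox, imageWidth, imageHeight)

  // A single normalized point converts the same way.
  let pixelPoint = VNImagePointForNormalizedPoint(
    CGPoint(x: 0.5, y: 0.5), imageWidth, imageHeight)

  // So does a normalized size, for scaling overlay elements.
  let pixelSize = VNImageSizeForNormalizedSize(
    CGSize(width: 0.25, height: 0.25), imageWidth, imageHeight)

  print(pixelRect, pixelPoint, pixelSize)
}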
Origin Points
One of the difficulties when you work with the Vision Framework is knowing where the origin, the (0, 0) point, of an image or rectangle is. All Vision observations that return coordinates (rectangles, points, etc.) assume that (0, 0) is at the bottom-left of the space. When working with a pure CGImage or CIImage, there won't be a problem, because those also have their origin at the bottom-left. However:
Origin Points of Different Image Formats in iOS Development
Back Camera (Portrait)
– Origin Point: Top-Left
– Description: The coordinate system for the rear-facing camera sensor in portrait mode.
Depending on the original format of the image, the result might have a different origin point. An image generated with the camera in landscape mode will have a different origin point than one in portrait. Generally, the underlying data for a UIImage might not be in the "right" orientation for display. For a camera image, there's EXIF metadata, and for a UIImage there's an imageOrientation property to tell the system which way to rotate it so the result looks "right-side up" regardless of the actual orientation of the sensor or the camera. This means that if you take a photo as a UIImage and run it through a VNImageRequestHandler to get some bounding boxes, the origin point of the bounding boxes and the origin point of the image might not match. So what you think is the top of the image might be the wrong place, or when you get your image out of your pipeline, it might be rotated.
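When you need to move a converted rect between those coordinate systems yourself, the math is simple. Here's a minimal sketch for flipping a pixel-space rect from Vision's bottom-left origin into UIKit's top-left origin; the function name is just for illustration:

import CoreGraphics

// Flip a pixel-space rect from a bottom-left origin to a top-left origin.
// imageHeight is the image's height in pixels.
func flippedForUIKit(_ rect: CGRect, imageHeight: CGFloat) -> CGRect {
  CGRect(x: rect.origin.x,
         y: imageHeight - rect.origin.y - rect.height,
         width: rect.width,
         height: rect.height)
}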
Camera Images or UIImagePicker Images After Converting to CVPixelBuffer
There are a number of strategies you can use to mitigate this. You might convert your UIImage to a .jpeg, because that bakes the rotation into the image data at the expected orientation. When you're drawing, you might transform the drawing-space orientation. Another way would be to just apply the same rotation that iOS applies to the result before displaying it.
For example, if iOS needs to rotate the result 90 degrees clockwise to make the orientation look correct, that means the original pixels are rotated 90 degrees counter-clockwise. So when you're drawing on the image, you'll need to assign the right .imageOrientation value to the final UIImage. You might apply a function like this one:
func convertImageOrientation(_ originalOrientation: UIImage.Orientation)
  -> UIImage.Orientation {
  switch originalOrientation {
  case .up: // 0
    return .downMirrored // 5
  case .down: // 1
    return .upMirrored // 4
  case .left: // 2
    return .rightMirrored // 7
  case .right: // 3
    return .leftMirrored // 6
  case .upMirrored: // 4
    return .down // 1
  case .downMirrored: // 5
    return .up // 0
  case .leftMirrored: // 6
    return .right // 3
  case .rightMirrored: // 7
    return .left // 2
  @unknown default:
    return originalOrientation
  }
}
In the code above, each of the commented values is the rawValue of the .imageOrientation. When you start with a UIImage that has an .up orientation, convert it to a CGImage and draw it as-is into a CGContext, it'll be .downMirrored when you convert it back to a UIImage. So when creating the final UIImage, just attach that orientation and iOS will take care of it.
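For example, after drawing into a CGContext, you might rebuild the display image like this. It's a sketch that reuses convertImageOrientation(_:) from above; the function and parameter names are illustrative:

import UIKit

// Attach the converted orientation so iOS displays the result right-side up.
func makeFinalImage(from drawnCGImage: CGImage,
                    original: UIImage) -> UIImage {
  UIImage(cgImage: drawnCGImage,
          scale: original.scale,
          orientation: convertImageOrientation(original.imageOrientation))
}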
Remember, this is just one way to deal with the origin point and rotation changes. But now that you're aware of the issue, you have a mental checklist to evaluate when your code isn't drawing boxes at the points where you expect.
Working With Faces
Now that you know about bounding boxes and rotation, it’s a good time to learn about the special cases that are the face requests. Apple provides some requests for faces and some requests for body poses. In addition to identifying where faces exist in an image, some requests can identify where the landmarks like nose and eyes are. Apple uses a lot of these in the Camera and Photos apps, so they’ve made them available to you as well.
iOS 11
VNDetectFaceRectanglesRequest and VNFaceObservation: Detects faces in an image by finding the bounding boxes of face regions.
VNDetectFaceLandmarksRequest and VNFaceObservation: Detects facial features such as eyes, nose, and mouth in detected face regions.
iOS 13
VNDetectFaceCaptureQualityRequest and VNFaceObservation: Estimates the quality of a captured face image; the score is exposed through the observation's faceCaptureQuality property.
iOS 14
VNDetectHumanBodyPoseRequest and VNHumanBodyPoseObservation: Detects and tracks human body poses in images or videos.
VNDetectHumanRectanglesRequest and VNHumanObservation: Detects human figures in an image.
You'll notice that a lot of the requests share a VNFaceObservation result type. However, based on the descriptions, it would seem they might return different kinds of data. They do. Remember that observations are subclassed. One of the parent classes provides the boundingBox of the observation. VNFaceObservations also have optional values for roll, pitch and yaw to tell you the orientation of the face, and they have a landmarks property of type VNFaceLandmarks2D. That contains a lot of information about where the edges of the face are, where the eyes are and where the mouth is. There are landmark entries for the pupils of the eyes, so you can determine if an eye is open or closed.
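Here's a sketch of what reading those properties can look like with a landmarks request. Error handling is trimmed, and the function name is illustrative:

import Vision

// Run a face landmarks request on a CGImage and read a few properties.
func printFaceInfo(in cgImage: CGImage) throws {
  let request = VNDetectFaceLandmarksRequest()
  let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
  try handler.perform([request])

  for face in request.results ?? [] {
    // boundingBox comes from a parent class and is normalized.
    print("Face bounding box:", face.boundingBox)
    // roll and yaw (and, in newer SDKs, pitch) are optional NSNumbers in radians.
    print("Roll:", face.roll ?? 0, "Yaw:", face.yaw ?? 0)
    // landmarks is a VNFaceLandmarks2D with regions like leftEye and outerLips.
    if let leftEye = face.landmarks?.leftEye {
      print("Left eye has \(leftEye.pointCount) normalized points")
    }
  }
}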
These requests follow the same pattern as all the others, so you should have no trouble using them.