Assignment 4
eyefish Xformed01 K 1.3 (output image, distortion corrected, before filter)
Assignment 4.1
1. Passport Photo
Use a photo of a person against a white background and generate a US (or your own country) passport photo using the normalization and standardization techniques introduced in this module.
You can find the detailed requirements here:
https://travel.state.gov/content/passports/en/passports/photos.html
Things to bear in mind
a. What should be the resolution in pixels for a 2 inch x 2 inch photo if a reasonable-quality print needs 300 DPI? What is DPI? Google it.
b. The specification says the head must be “between 1 - 1 3/8 inches (25 - 35 mm) from the bottom of the chin to the top of the head.” We don’t have this information using Dlib’s landmarks. Can we approximate it by normalizing based on the distance between the eyes instead? How would you test if your approximation is correct?
c. If you want to get fancy, you can check if the input photo is a good enough photo for creating a passport photo. This is not an easy task, and all the checks below are not covered in the course, but you can check
i. If a face is detected.
ii. Resolution of the cropped face.
iii. Person is looking straight at the camera.
iv. [Advanced users]: If the image is sufficiently well lit. The problem sounds easier than it is. The skin pixels are bad for estimating lighting. Can you use the background? How about the whites of the eyes?
v. [Advanced users]: Check if the image is blurry.
vi. [Advanced users]: Check for noise levels.
Answer to assignment 4.1
a.- DPI means dots per inch. We take 300 ppi == 300 dpi. For a 2 x 2 inch photo at 300 dpi, the resolution is 600 x 600 dots, or 600 x 600 pixels on the screen.
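The arithmetic in answer a can be written as a small helper. This is only a sketch; the function name pixelsForPrint is made up for illustration and is not part of the program described in this report.

```cpp
#include <cassert>
#include <cmath>

// Pixels needed along one side of a print: inches * dots per inch,
// rounded to the nearest whole pixel.
// (Hypothetical helper name, for illustration only.)
int pixelsForPrint(double inches, double dpi) {
    assert(inches > 0.0 && dpi > 0.0);
    return static_cast<int>(std::lround(inches * dpi));
}
```

Calling pixelsForPrint(2.0, 300.0) gives the 600 pixels per side used in the rest of this answer.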
b.1.- The specification says the “head must be between 1 and 1 3/8 inches”, which at 600 x 600 resolution means the head must be between 300 and 412 dots.
b.2.- The specification says the “eye line must be from 1 1/8 to 1 3/8 inches from the bottom of the photo”, which means from 337 to 412 pixels from the bottom. Since our origin is in the upper left corner, the y-coordinate of the eye line is between 188 and 263. We can approximate the head height in units of one eye length: if an 8-eye head is 400 dots, one eye is 50 dots in our photo, and the minimum head is 300/50 = 6 eyes. Normally a head is between 6 and 7 eyes high, so this leaves about one eye of room on the upper side.
b.3.- Usually the eye line divides the face into two halves. With Dlib we can test that the distance from the eye line to the tip of the chin is less than 412/2 = 206 dots.
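The pixel limits in b.1 and b.2 can be derived from the side length of the output photo. The sketch below assumes a square 2 x 2 inch photo; the struct and function names are hypothetical. It keeps the exact fractional values (412.5, 187.5 and 262.5 for a 600-pixel side), which this report writes as 412, 188 and 263.

```cpp
#include <cassert>

// Limits in pixels for a square 2 in x 2 in US passport photo.
// (Hypothetical names, for illustration only.)
struct PassportGeometry {
    double headMinPx;   // minimum chin-to-top-of-head height
    double headMaxPx;   // maximum chin-to-top-of-head height
    double eyeYTopMin;  // smallest eye-line y, measured from the top edge
    double eyeYTopMax;  // largest eye-line y, measured from the top edge
};

PassportGeometry usPassportGeometry(double sidePx) {
    assert(sidePx > 0.0);
    const double pxPerInch = sidePx / 2.0;  // the photo side is 2 inches
    PassportGeometry g;
    g.headMinPx = 1.0 * pxPerInch;    // 1 inch
    g.headMaxPx = 1.375 * pxPerInch;  // 1 3/8 inches
    // The eye line is 1 1/8 to 1 3/8 inches above the bottom edge.
    // The image origin is the upper left corner, so subtract from the side.
    g.eyeYTopMin = sidePx - 1.375 * pxPerInch;
    g.eyeYTopMax = sidePx - 1.125 * pxPerInch;
    return g;
}
```

A detected eye line can then be checked against eyeYTopMin and eyeYTopMax after the crop.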
c.- Checklist
c.1.- Is a face detected? Use Dlib to detect the face.
c.2.- Resolution of the cropped face? 5 times the eye length x 2 times the distance in b.3.
c.3.- Is the person looking straight at the camera? Find the position of points 68 and 69 in the 70-point landmark set.
c.4.- Is the image sufficiently well lit? Use the histogram of the whites of the eyes.
c.5.- Check if the image is blurry.
c.6.- Check for noise levels.
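Item c.5 does not name a method. One common heuristic, assumed here and not taken from the course or from the program in this report, is the variance of the Laplacian: a sharp image has strong edges and a high variance, while a blurry one has a low variance. The sketch below works on a plain row-major grayscale buffer so that it needs no library. The function name is hypothetical, and any threshold would have to be tuned on real photos.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Variance of a 4-neighbour Laplacian over the interior pixels of a
// row-major grayscale image. Low values suggest few sharp edges.
// (Hypothetical helper, for illustration only.)
double laplacianVariance(const std::vector<double>& gray,
                         std::size_t width, std::size_t height) {
    assert(width >= 3 && height >= 3 && gray.size() == width * height);
    double sum = 0.0, sumSq = 0.0;
    std::size_t n = 0;
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            const std::size_t i = y * width + x;
            const double lap = gray[i - 1] + gray[i + 1] +
                               gray[i - width] + gray[i + width] -
                               4.0 * gray[i];
            sum += lap;
            sumSq += lap * lap;
            ++n;
        }
    }
    const double mean = sum / static_cast<double>(n);
    return sumSq / static_cast<double>(n) - mean * mean;
}
```

A uniform image gives a variance of zero; the cropped face region would be the natural input for this check.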
I wrote a program that computes the 70 landmarks for a photo, crops the photo according to the regulations for US passports, and adds white strips to center the photo as required.
The program also checks whether the person in the photo is looking straight at the camera.
Below are some images with the results. I used several input image sizes, but the output image on the right is always 600 x 600 pixels:
looking straight to the camera value = 0.013228
looking straight to the camera value = 0.00423745
looking straight to the camera value = 0.0325 (greater than my empirical limit of 0.02)
looking straight to the camera value = 0.000971146
looking straight to the camera value = 0.00414865
Assignment 4.2
Blink detection
a. Use a different measure for finding the status of the eye.
b. Use a different method for normalizing the eye area and check its robustness.
I computed several distances:
1.- The distance between landmark(38) and landmark(40) (between eyelids of the right eye)
2.- The distance between landmark(44) and landmark(46) (between eyelids of the left eye)
3.- The distance between landmark(36) and landmark(45) (between the outer corners of the eyes), which happens to be approximately 3 times the distance between the corners of one eye.
const float factor = 3.0f;  // the outer eye-corner distance is about 3 eye lengths
normalizedCount = factor * (lenEyelidsLeft + lenEyelidsRight) / lenOutEyesCorners;
Explanation:
Module 4.6, Blink and Drowsiness Detection, calculates the area of the eye and divides it by the squared length of the eye to normalize that measure.
The area of the eye is approximately half the area of the bounding box of one eye. In the numerator, we add the left and right eyelid distances and, conceptually, multiply the sum by the length of the eye.
In the denominator, we have the outer distance between the eyes, conceptually multiplied by the length of the eye.
The eye length is the same factor in the numerator and the denominator, so it cancels out.
The resulting expression is three times lower because the outer distance between the eyes is three times the length of one eye; this explains factor = 3.0.
It works, and I don't need to tweak other parameters.
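The measure above can be sketched as a self-contained function. The landmark indices and the variable names lenEyelidsLeft, lenEyelidsRight and lenOutEyesCorners come from this report; the Point2f struct, pointDistance and eyeOpenness are hypothetical stand-ins for the types and code in the real program, and the landmarks are assumed to follow Dlib's 68-point ordering.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal 2D point, a stand-in for the landmark type of the real program.
struct Point2f { float x, y; };

// Euclidean distance between two points.
float pointDistance(const Point2f& a, const Point2f& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Eyelid openness of both eyes, normalized by the distance between the
// outer eye corners. The factor 3 compensates for that distance being
// about three eye lengths.
float eyeOpenness(const std::vector<Point2f>& lm, float factor = 3.0f) {
    assert(lm.size() >= 68);
    const float lenEyelidsRight = pointDistance(lm[38], lm[40]);
    const float lenEyelidsLeft = pointDistance(lm[44], lm[46]);
    const float lenOutEyesCorners = pointDistance(lm[36], lm[45]);
    assert(lenOutEyesCorners > 0.0f);
    return factor * (lenEyelidsLeft + lenEyelidsRight) / lenOutEyesCorners;
}
```

Because both the numerator and the denominator are lengths measured in the same image, the value does not change when the face is scaled, which is what makes a fixed blink threshold usable.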
Assignment 4.3
3. Funny faces
Take an image having fisheye distortion and try to correct it by inverting the distortion.
You can choose the parameters manually using sliders.
Image fisheye_donald.jpg
My answer:
I started with the equations:
I got interesting outputs:
grid default values
grid pincushion distortion parameter .0000036
grid barrel distortion parameter -0000018
eyefish parameter 0000061b
That came close inside the red rectangle, but it does not solve the problem.
Next, I tried the trigonometric function used in module 4.7, and that did the trick with k=1.3:
// Pincushion distortion function
float rn = std::min((double)r, r + (pow(r, k) - r) * cos(M_PI * r));
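To see how the expression behaves, it can be wrapped in a small function. This is a sketch under the assumption that r is a radius normalized to the range [0, 1]; the report does not state the normalization, and the name remapRadius is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Radial remapping from the line above, wrapped as a function.
// Assumes r is a normalized radius in [0, 1] (an assumption, see above).
// With k = 1 the radius is unchanged; with k > 1 radii below 0.5 shrink
// and radii from 0.5 upward are kept as they are by the std::min clamp.
double remapRadius(double r, double k) {
    assert(r >= 0.0 && k > 0.0);
    const double pi = std::acos(-1.0);
    return std::min(r, r + (std::pow(r, k) - r) * std::cos(pi * r));
}
```

For k = 1.3, a radius of 0.25 maps to a smaller value while a radius of 0.75 is left at 0.75, so the correction acts only on the central part of the image.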
eyefish Xformed01 K 1 default parameter (input image)
eyefish Xformed01 K 1.3 (output image, distortion corrected)
There are some elliptic artifacts because the transformation is not bijective and I can’t easily solve the transformation equation for xu and yu. Let's try to improve it.
I know that the inverse transformation must be continuous. It is represented by the arrays that I created, IXu(y,x) and IYu(y,x), so I filter these arrays with medianBlur() to fill the gaps. The output images are below:
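The gap-filling idea can be illustrated without OpenCV. The sketch below is a plain 3 x 3 median filter over a row-major float map, used here as a simplified stand-in for medianBlur() applied to IXu or IYu. The function name median3x3 is hypothetical, and unlike OpenCV this version simply copies the border pixels unchanged.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// 3x3 median filter on a row-major float map. An isolated hole in a
// smooth coordinate map is replaced by the median of its neighbourhood.
// Border pixels are copied unchanged (a simplification).
std::vector<float> median3x3(const std::vector<float>& map,
                             std::size_t width, std::size_t height) {
    assert(map.size() == width * height);
    std::vector<float> out(map);
    for (std::size_t y = 1; y + 1 < height; ++y) {
        for (std::size_t x = 1; x + 1 < width; ++x) {
            std::array<float, 9> window;
            std::size_t n = 0;
            for (std::size_t dy = 0; dy < 3; ++dy)
                for (std::size_t dx = 0; dx < 3; ++dx)
                    window[n++] = map[(y + dy - 1) * width + (x + dx - 1)];
            // Partially sort so that the middle element is the median.
            std::nth_element(window.begin(), window.begin() + 4, window.end());
            out[y * width + x] = window[4];
        }
    }
    return out;
}
```

On a map whose value equals the column index, a single hole in the middle is restored to the column value of its neighbours, which is the effect that removes the elliptic artifacts.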
eyefish Xformed01 K 1.0 default parameter + MedianBlur (input image, distortion not corrected)
eyefish Xformed01 K 1.3 MedianBlur (elliptic artifacts disappear) (output image, distortion corrected)
quadCircle Xformed01 K 1.0 default value MedianBlur (this image has the same distortion as eyefish donald_trump) (input image, distortion not corrected)
quadCircle Xformed01 K 1.3 MedianBlur (output image, distortion corrected)