Sunday, July 23, 2017

Module 4 - Assignments




Assignment 4

eyefish Xformed01  K 1.3 (output image, corrected distortion)(before filter)
 
Assignment 4.1
1. Passport Photo
Use a photo of a person against a white background and generate a US ( or your own
country ) passport photo using the normalization and standardization techniques
introduced in this module.
You can find the detailed requirements here
https://travel.state.gov/content/passports/en/passports/photos.html
Things to bear in mind
a. What should be the resolution in pixels for a 2 inch x 2 inch photo if, for a
reasonable photo quality print, you need 300 DPI? What is DPI? Google it.
b. In the specification, it says head must be “between 1 -1 3/8 inches (25 - 35 mm)
from the bottom of the chin to the top of the head.” We don’t have this information
using Dlib’s landmarks. Can we approximate it by normalizing based on the
distance between the eyes instead? How would you test if your approximation is
correct?
c. If you want to get fancy, you can check if the input photo is a good enough
photo for creating a passport photo. This is not an easy task, and all the checks
below are not covered in the course, but you can check
i. If face is detected.
ii. Resolution of the cropped face.
iii. Person is looking straight at the camera.
iv. [Advanced users]: If the image is sufficiently well lit. The problem sounds
easier than it is. The skin pixels are bad for estimating lighting. Can you
use the background? How about the whites of the eyes?
v. [Advanced users]: Check if the image is blurry.
vi. [Advanced users]: Check for noise levels.

Answer to assignment 4.1


a.- We say that 300 PPI is equivalent to 300 DPI. For a 2 x 2 inch photo at 300 DPI, the resolution will be 600 x 600 dots, or 600 x 600 pixels on screen.
b.1.- The specification says the head must be between 1 and 1 3/8 inches; at a 600 x 600 resolution, that means the head must be between 300 and 412 pixels.
b.2.- The specification says the eye line must be from 1 1/8 to 1 3/8 inches from the bottom of the photo, that is, from 337 to 412 pixels from the bottom (since our origin is the upper-left corner, the y-coordinate lies between 188 and 263). We can approximate the head height from the eyes: if we take a head of 8 eye-widths to be 400 pixels, one eye-width is 50 pixels in our photo, and the minimum head of 300 pixels is 300/50 = 6 eye-widths. Normally a head is between 6 and 7 eye-widths tall, so this leaves some room above the head (see the sketch after b.3).
b.3.- Usually the eye line divides the face into two halves, so we can test with Dlib that the distance from the eye line to the tip of the chin is below 412/2 = 206 pixels.
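A minimal sketch of this normalization (my own names and choices; it assumes the Dlib 68-point landmarks are already computed and takes a head of 6.5 eye-widths, the middle of the 6-7 range above):

#include <cmath>
#include <dlib/image_processing.h>

// Sketch: estimate the resize factor so that the head height lands inside the
// 300 - 412 pixel band required by the spec, using the outer-eye-corner
// distance (landmarks 36 and 45) as the normalizer.
double passportScale(const dlib::full_object_detection &shape)
{
    double dx = (double)(shape.part(45).x() - shape.part(36).x());
    double dy = (double)(shape.part(45).y() - shape.part(36).y());
    double outerEyes  = std::sqrt(dx * dx + dy * dy);
    double eyeWidth   = outerEyes / 3.0;   // one eye is roughly 1/3 of that distance
    double headHeight = 6.5 * eyeWidth;    // assumed: a head is about 6.5 eye-widths tall
    double targetHead = 356.0;             // middle of the 300 - 412 pixel range
    return targetHead / headHeight;        // scale factor to pass to cv::resize()
}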
c.- Checklist
c.1.- Is a face detected? Use Dlib to detect the face.
c.2.- Resolution of the cropped face? About 5 eye-widths wide by 2 times the distance from b.3 tall.
c.3.- Is the person looking straight at the camera? Use the positions of points 68 and 69 of the 70-point landmark model.
c.4.- Is the image sufficiently well lit? Use a histogram of the whites of the eyes.
c.5.- Check if the image is blurry (see the sketch after this list).
c.6.- Check for noise levels.
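For c.5, a standard trick (not covered in the course) is the variance of the Laplacian of the cropped face; the threshold below is only a starting point to tune:

#include <opencv2/opencv.hpp>

// Sketch for c.5: variance of the Laplacian as a simple sharpness score.
// A low variance suggests a blurry photo.
bool isBlurry(const cv::Mat &faceBGR, double threshold = 100.0)
{
    cv::Mat gray, lap;
    cv::cvtColor(faceBGR, gray, cv::COLOR_BGR2GRAY);
    cv::Laplacian(gray, lap, CV_64F);
    cv::Scalar mean, stddev;
    cv::meanStdDev(lap, mean, stddev);
    return stddev[0] * stddev[0] < threshold;   // variance below threshold -> probably blurry
}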
I wrote a program that computes the 70 landmarks for a photo, crops it according to the regulations for US passports, and adds white strips to center the photo as required.
The program also checks whether the person in the photo is looking straight at the camera.
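A sketch of the kind of measure that produces the values below, assuming points 68 and 69 of the 70-point model are the pupil centers (the exact formula and names are illustrative, not the literal code):

#include <cmath>
#include <dlib/image_processing.h>

// Sketch: average normalized horizontal offset of each pupil from the midpoint
// of its eye corners; values near 0 mean the person looks straight at the camera.
double lookingStraightValue(const dlib::full_object_detection &s)
{
    auto offset = [&](int pupil, int c1, int c2) {
        double mid   = (s.part(c1).x() + s.part(c2).x()) / 2.0;     // eye-corner midpoint
        double width = std::abs((double)(s.part(c2).x() - s.part(c1).x()));
        return std::abs((double)s.part(pupil).x() - mid) / width;   // normalized offset
    };
    // eye corners 36-39 and 42-45; the pupil-to-eye pairing is assumed
    return (offset(68, 36, 39) + offset(69, 42, 45)) / 2.0;
}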
Below are some images with the results. I used several input image sizes, but the output image on the right is always 600 x 600 pixels:

looking straight at the camera, value = 0.013228

looking straight at the camera, value = 0.00423745

looking straight at the camera, value = 0.0325, greater than my empirical limit of 0.02

looking straight at the camera, value = 0.000971146


looking straight at the camera, value = 0.00414865


Assignment 4.2
Blink detection


a. Use a different measure for finding the status of the eye.
b. Use a different method for normalizing the eye area and check its robustness.

I computed several distances:
1.- The distance between landmark(38) and landmark(40) (between the eyelids of the right eye).
2.- The distance between landmark(44) and landmark(46) (between the eyelids of the left eye).
3.- The distance between landmark(36) and landmark(45) (between the outer corners of the eyes), which happens to be approximately 3 times the distance between the corners of one eye.
float factor = 3.0f;
float normalizedCount = factor * (lenEyelidsLeft + lenEyelidsRight) / lenOutEyesCorners;
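For reference, a sketch of how those three distances can be obtained from the Dlib landmarks (the helper function is mine; shape is the detection returned by the shape predictor):

#include <cmath>
#include <dlib/image_processing.h>

// Sketch: Euclidean distance between two landmarks of the detected shape.
float landmarkDist(const dlib::full_object_detection &shape, int a, int b)
{
    double dx = (double)(shape.part(a).x() - shape.part(b).x());
    double dy = (double)(shape.part(a).y() - shape.part(b).y());
    return (float)std::sqrt(dx * dx + dy * dy);
}

// Usage:
//   float lenEyelidsRight   = landmarkDist(shape, 38, 40);
//   float lenEyelidsLeft    = landmarkDist(shape, 44, 46);
//   float lenOutEyesCorners = landmarkDist(shape, 36, 45);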

Explanation: 

In Module 4.6, Blink and Drowsiness Detection, the area of the eye is calculated and divided by the squared length of the eye to normalize that measure.

The area of the eye is approximately half the area of the bounding box of one eye. We add the two eyelid distances, left and right, and imagine them multiplied by the length of the eye.
In the denominator we have the outer distance between the eyes, again imagined multiplied by the length of the eye,
so the common factor cancels between numerator and denominator.
The expression is three times smaller because the outer distance between the eyes is about three times the width of one eye, which explains the factor = 3.0.
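In symbols, with h_L and h_R the eyelid distances, w the width of one eye, and D ≈ 3w the distance between the outer corners:

\[
\frac{\text{area}}{w^{2}} \approx \frac{\tfrac{1}{2}\,h\,w}{w^{2}} = \frac{h}{2w},
\qquad
3\,\frac{h_L + h_R}{D} \approx 3\,\frac{h_L + h_R}{3w} = \frac{h_L + h_R}{w}.
\]

Both measures are proportional to h/w, so they rise and fall together as the eye opens and closes.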

It works and I don't need to tweak other parameters. 



Assignment 4.3

3. Funny faces


Take an image having fisheye distortion and try to correct it by inverting the distortion.
You can choose the parameters manually using sliders.




Image fisheye_donald.jpg

My answer:
I started with the equations:
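These are the standard radial distortion equations, where (x_u, y_u) are the undistorted coordinates measured from the image center, r_u is their distance to the center, and k is the coefficient (positive k gives pincushion, negative k gives barrel):

\[
x_d = x_u\,(1 + k\,r_u^{2}), \qquad
y_d = y_u\,(1 + k\,r_u^{2}), \qquad
r_u^{2} = x_u^{2} + y_u^{2}.
\]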
I got interesting outputs:


grid default values

grid, pincushion distortion, parameter .0000036


grid, barrel distortion, parameter -.0000018

eyefish, parameter .0000061

That came close inside the red rectangle, but it does not solve the problem.

Next, I tried the trigonometric function used in Module 4.7, and that did the trick with k = 1.3:
// Pincushion distortion function
float rn = std::min((double)r, r + (pow(r, k) - r) * cos(M_PI * r));
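To apply it, I scatter every input pixel into two map arrays that hold, for each output pixel, the coordinates of the input pixel that lands there. A simplified sketch (the normalization by the half-diagonal and the names are approximate; k is the distortion parameter, here 1.3):

#include <cmath>
#include <opencv2/opencv.hpp>

// Sketch: build the maps by forward-scattering. For every input pixel we
// compute where it lands in the corrected image and store the input
// coordinates there; positions where nothing lands keep the value -1.
void buildMaps(const cv::Mat &src, float k, cv::Mat &IXu, cv::Mat &IYu)
{
    IXu = cv::Mat(src.size(), CV_32FC1, cv::Scalar(-1));
    IYu = cv::Mat(src.size(), CV_32FC1, cv::Scalar(-1));
    float cx = src.cols / 2.0f, cy = src.rows / 2.0f;
    float norm = std::sqrt(cx * cx + cy * cy);              // half-diagonal, so r runs in [0, 1]
    for (int y = 0; y < src.rows; y++)
        for (int x = 0; x < src.cols; x++)
        {
            float dx = (x - cx) / norm, dy = (y - cy) / norm;
            float r = std::sqrt(dx * dx + dy * dy);
            if (r == 0.0f) { IXu.at<float>(y, x) = (float)x; IYu.at<float>(y, x) = (float)y; continue; }
            float rn = std::min((double)r, r + (pow(r, k) - r) * cos(M_PI * r));
            int xn = cvRound(cx + dx * (rn / r) * norm);     // where this pixel lands
            int yn = cvRound(cy + dy * (rn / r) * norm);
            if (xn >= 0 && xn < src.cols && yn >= 0 && yn < src.rows)
            {
                IXu.at<float>(yn, xn) = (float)x;            // remember the source coordinates
                IYu.at<float>(yn, xn) = (float)y;
            }
        }
}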



eyefish Xformed01  K 1 default parameter (input image)

eyefish Xformed01  K 1.3 (output image, corrected distortion)

There are some elliptical artifacts because the transformation is not bijective and I can't easily solve the transformation equation for xu and yu. Let's try to improve it.

I know that the inverse transformation must be continuous. It is represented by the arrays I created, IXu(y,x) and IYu(y,x), so I filter these arrays with medianBlur() to fill the gaps.
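In sketch form, the gap filling and the final lookup (the kernel size is a choice to tune; CV_32F maps accept ksize 3 or 5):

// Fill the holes in the maps with the median of their neighbours,
// then use the maps to look up the corrected image.
cv::Mat IXuFilled, IYuFilled, corrected;
cv::medianBlur(IXu, IXuFilled, 5);
cv::medianBlur(IYu, IYuFilled, 5);
cv::remap(src, corrected, IXuFilled, IYuFilled, cv::INTER_LINEAR);

The output images are below: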



eyefish Xformed01  K 1.0 default parameter + MedianBlur (input image, distortion not corrected)



eyefish Xformed01  K 1.3 MedianBlur (elliptical artifacts disappear) (output image with distortion corrected)


quadCircle Xformed01  K 1.0 default value MedianBlur (this image has the same distortion as the eyefish donald_trump image) (input image, distortion not corrected)



quadCircle Xformed01  K 1.3 MedianBlur (output image with distortion corrected)
 

