The goals / steps of this project are the following:
- Compute the camera calibration matrix and distortion coefficients given a set of chessboard images.
- Apply a distortion correction to raw images.
- Use color transforms, gradients, etc., to create a thresholded binary image.
- Apply a perspective transform to rectify binary image ("birds-eye view").
- Detect lane pixels and fit to find the lane boundary.
- Determine the curvature of the lane and vehicle position with respect to center.
- Warp the detected lane boundaries back onto the original image.
- Output visual display of the lane boundaries and numerical estimation of lane curvature and vehicle position.
Rubric Points
Here I will consider the rubric points individually and describe how I addressed each point in my implementation.
You're reading it! The entire project logic is contained in lane_finder.py. In this file, I calculate the calibration matrix, then define a sequence of functions for each component of the pipeline. At the end of the file, lines #385 - #442 contain a function that calls all the required functions in a logical order. Lines #456 - #464 call this function for image files, and lines #467 - #474 call it for videos.
1. Briefly state how you computed the camera matrix and distortion coefficients. Provide an example of a distortion corrected calibration image.
Using the images designed for calibration, lines #47 - #81 read in and process the images. Each image is converted to grayscale (line #51), and the findChessboardCorners function is applied to it (line #56). When the chessboard corners are found, object points and image points are created and used to calibrate the camera lens (line #78 calls the calibrate-camera routine defined at line #29).
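A minimal sketch of that calibration flow (the 9x6 inner-corner pattern size and the `camera_cal/` glob path are my assumptions for illustration, not taken from lane_finder.py):

```python
import glob
import cv2
import numpy as np

# Object points form a regular grid for a 9x6 inner-corner chessboard:
# (0,0,0), (1,0,0), ..., (8,5,0)
objp = np.zeros((9 * 6, 3), np.float32)
objp[:, :2] = np.mgrid[0:9, 0:6].T.reshape(-1, 2)

objpoints = []  # 3D points in real-world space
imgpoints = []  # 2D points in the image plane

for fname in glob.glob('camera_cal/calibration*.jpg'):  # assumed path
    img = cv2.imread(fname)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, (9, 6), None)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

# One calibration over all collected point pairs
ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(
    objpoints, imgpoints, gray.shape[::-1], None, None)
```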
The distortion correction is applied to the image during the thresholding function, as described in the next step. Essentially, an image is read in and passed to the thresholding function. During this phase, the image is converted to grayscale (line #116), and the objpoints and imgpoints calculated in the previous section are used to undistort the image (line #117), which is then used for further processing. However, there is little difference between the results generated by undistorted versus distorted images, most likely due to the use of a bounding box to restrict the area of interest.
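With a camera matrix `mtx` and distortion coefficients `dist` in hand (names carried over from the sketch above), the correction itself is a single call:

```python
# Remap the raw frame so straight lines in the world stay straight
undistorted = cv2.undistort(img, mtx, dist, None, mtx)
```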
2. Describe how (and identify where in your code) you used color transforms, gradients or other methods to create a thresholded binary image. Provide an example of a binary image result.
The thresholding function is broken down into sections.
- Converting the original image to HLS space (lines #99 - #107)
- Converting the original image to YUV space (lines #112 - #114)
- Converting the YUV image to grayscale (lines #120 - #121)
- Calculating absolute X and Y Sobel values (lines #137 - #148; sketched below)
- Calculating the magnitude Sobel values (lines #152 - #163)
- Calculating the directional Sobel values (lines #167 - #172)
- Outputting the thresholded image (lines #176 - #189)
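To illustrate one of these steps, here is a sketch of an absolute-Sobel threshold; the function name, default threshold values, and 0/1 output scale are my assumptions rather than the exact code in lane_finder.py:

```python
import cv2
import numpy as np

def abs_sobel_thresh(gray, orient='x', thresh=(20, 100)):
    # Derivative in x or y via cv2.Sobel
    sobel = cv2.Sobel(gray, cv2.CV_64F, int(orient == 'x'), int(orient == 'y'))
    abs_sobel = np.absolute(sobel)
    # Rescale to 0-255 so the same thresholds behave consistently per image
    scaled = np.uint8(255 * abs_sobel / np.max(abs_sobel))
    binary = np.zeros_like(scaled)
    binary[(scaled >= thresh[0]) & (scaled <= thresh[1])] = 1
    return binary
```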
For each calculation, the final image was also morphologically transformed using MORPH_CLOSE. This dilates the image to join the sparse points, then erodes it to return the regions to approximately their original size. A 3x3 kernel was used to ensure the image didn't bloat excessively.
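In OpenCV terms, that close operation looks roughly like this (assuming `binary` is a uint8 0/1 image):

```python
# MORPH_CLOSE = dilation followed by erosion; the small 3x3 kernel
# joins nearby points without growing the regions much
kernel = np.ones((3, 3), np.uint8)
closed = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
```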
The rule for which thresholds to use was: take pixels where the absolute X and Y values overlapped (step 4 in the list above), or where the magnitude and directional values overlapped. Due to the ordering, it will presumably always take the overlapping absolute values over a potentially better direction/magnitude combination; some work could be done to ascertain which is better.
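That combination rule amounts to something like the following, where `gradx`, `grady`, `mag_binary`, and `dir_binary` are placeholder names for the four binary outputs above:

```python
# Keep pixels where both absolute-Sobel thresholds agree, OR where
# the magnitude and direction thresholds agree
combined = np.zeros_like(gradx)
combined[((gradx == 1) & (grady == 1)) |
         ((mag_binary == 1) & (dir_binary == 1))] = 1
```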
A specific timesink for me was that some images, and previous examples in the notes, had scaled thresholds of 1, whereas some of my images were on a 0-1 scale and some on 0-255. As such, some of my binarisations needed to compare against 1 and some against 255. This is a quick-fix approach; going forward, some work will be done to ensure all images are similarly scaled.
- Thresholding: The order of the thresholding calculations is worth exploring, to determine which combination of thresholds provides the most information gained. For example, sometimes an absolute X value with a magnitude calculation captured much of the required information; optimising the code to pick the right combination could be improved.
- Scaling: Ensuring the images are all scaled consistently. Matplotlib and OpenCV also read files in different formats, so a standardisation process for all images would be useful.
- Dynamic: Exploring some dynamic iteration of thresholds to determine the most valid pixels of interest. This is potentially useful when colours change rapidly in an image (shadows, transitions between road surfaces), which my algorithm struggled with (minorly for the project video, quite drastically for the harder challenge).
- Speed: Adding in the morphology made the algorithm extremely slow, so further work will need to be done to optimise the speed.
3. Describe how (and identify where in your code) you performed a perspective transform and provide an example of a transformed image.
For the perspective transform, I took the bounding box where I presumed the lines would mostly be (in front of the car, extending almost to the horizon) and performed the perspective transform specifically on this bounding box. The transform matrix is computed in the calculate_M function (lines #221 - #225), and that M is used in the subsequent functions corner_unwarp and corner_warp, which change to a birds-eye view and back to the perspective view respectively.
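The mechanics are the standard OpenCV pair of calls; the `src`/`dst` corner values below are illustrative stand-ins for the bounding box on a 1280x720 frame, not the exact numbers in calculate_M:

```python
# Trapezoid on the road (src) mapped to a rectangle in the
# birds-eye image (dst); values are illustrative only
src = np.float32([[580, 460], [700, 460], [1040, 680], [260, 680]])
dst = np.float32([[260, 0], [1040, 0], [1040, 720], [260, 720]])

M = cv2.getPerspectiveTransform(src, dst)
Minv = cv2.getPerspectiveTransform(dst, src)  # to warp back later
warped = cv2.warpPerspective(binary, M, (binary.shape[1], binary.shape[0]))
```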
- Bounding box: The bounding box approach I used is accurate for gently curving and straight roads. However, it needs further work for sharp cornering and more dynamic surfaces.
- Perspective of the bounding box: This may become less valid if the size of the bounding box is dramatically increased. Also, focussing only on the immediate area ignores the complexities of engaging with traffic, which will be a concern if the bounding box grows or is removed.
4. Describe how (and identify where in your code) you identified lane-line pixels and fit their positions with a polynomial.
From lines #234 to #311 the calculations for the lane lines occur. Using windowed histograms on the warped, thresholded image enabled me to locate the most common pixel positions in the image. This only works if the thresholding applied previously returns (as much as possible) just the items of interest; if there is a lot of noise in the thresholded image, or very little detail, the histogram detection is less accurate. In this instance the number of windows was arbitrarily chosen as 15, and this seemed to work. When the thresholding function performs poorly, more pixels are needed to produce an accurate measure (minpix = 40) and the windows can be made larger to take in more pixels (margin = 180).
The loop `for window in range(nwindows):` enables us to iterate through the windows, using the result of the previous window as a starting point for the next to reduce calculation time.
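A condensed sketch of the histogram seeding and window loop, using the parameter values quoted above (variable names are mine, and only the left line is shown; the right is handled symmetrically):

```python
# Histogram of the lower half of the warped binary locates the line's base
histogram = np.sum(warped[warped.shape[0] // 2:, :], axis=0)
midpoint = histogram.shape[0] // 2
leftx_current = np.argmax(histogram[:midpoint])

nwindows, margin, minpix = 15, 180, 40
window_height = warped.shape[0] // nwindows
nonzeroy, nonzerox = warped.nonzero()
left_lane_inds = []

for window in range(nwindows):
    y_low = warped.shape[0] - (window + 1) * window_height
    y_high = warped.shape[0] - window * window_height
    good_left = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                 (nonzerox >= leftx_current - margin) &
                 (nonzerox < leftx_current + margin)).nonzero()[0]
    left_lane_inds.append(good_left)
    # Re-centre the next window on the mean of the pixels just found
    if len(good_left) > minpix:
        leftx_current = int(np.mean(nonzerox[good_left]))
```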
The calculation is highly reliant on the accuracy of the thresholding and its results: the more accurate the threshold, the stricter the criterion for a pixel entering a window can be. One possible improvement is to determine which lane line is detected more accurately and offset the other side from it, to ensure the correct pixels are captured. This might require a 'confidence level' for which side is correct, but is worth investigating.
The fit_polynomial function (lines #314 - #350) takes the output from the windowing and, using numpy, fits a polynomial to the points. The function starts by defining empty arrays as the left and right fits, which solved some issues when no points were recognised on one side or the other and the fit function could not run.
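The core of that step, with the empty-array guard (the pixel arrays `leftx`/`lefty` are assumed names):

```python
# Fit x = A*y^2 + B*y + C; fitting x as a function of y suits
# near-vertical lane lines
left_fit = np.array([])
if len(leftx) > 0:
    left_fit = np.polyfit(lefty, leftx, 2)
```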
Adding more rigour around safely exiting when the polynomial doesn't predict is good programming practice to add in later. Ensuring that the left and right lane lines are approximately equal in curvature would also help catch drastic errors. Saving and persisting the lane lines, points, and curvatures from one frame to the next would be quite useful and could reduce the need to wholly recalculate the curvature each time. Another useful component might be to predict sections of road as distinct pieces: knowing that you are driving on a straight section, with some curvature approaching in X metres, is more useful than fitting a spline to a section of road that has distinct phases.
5. Describe how (and identify where in your code) you calculated the radius of curvature of the lane and the position of the vehicle with respect to center.
The functions compute_curvature (lines #353 - #367) and compute_offset (lines #369 - #376) do what their titles say. Taking the fitted polynomial line calculated in the previous steps, again using numpy polyfit, I calculated the curvature of the arc based on the number of pixels in the image. By extrapolating this curve out and applying the image pixel-to-metre conversion, I could approximate the curvature in metres for the arc of each section of road (lines #361 and #364). This obviously makes more sense when the lane is visibly curved, but the algorithm also calculates a value on straight sections, as it attempts to fit a second-order polynomial to an almost straight line segment.
Similar logic was used to calculate the offset, where the pixel distance was multiplied by a conversion factor to determine the vehicle's position relative to the average of the left and right fitted lane lines that bound it.
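A sketch of both calculations; the metre-per-pixel factors are commonly used values for this project, and the variable names (`ploty`, `left_fitx`, `right_fitx`, `img_width`) are assumptions carried over from the earlier sketches:

```python
# Assumed pixel-to-metre conversions
ym_per_pix = 30 / 720   # metres per pixel in y
xm_per_pix = 3.7 / 700  # metres per pixel in x

# Refit in world space, then apply R = (1 + (2Ay + B)^2)^(3/2) / |2A|
fit_cr = np.polyfit(ploty * ym_per_pix, left_fitx * xm_per_pix, 2)
y_eval = np.max(ploty) * ym_per_pix
curvature = ((1 + (2 * fit_cr[0] * y_eval + fit_cr[1]) ** 2) ** 1.5
             / abs(2 * fit_cr[0]))

# Offset: image centre vs. lane centre at the bottom of the frame
lane_centre = (left_fitx[-1] + right_fitx[-1]) / 2
offset = (img_width / 2 - lane_centre) * xm_per_pix
```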
Some sanity checking of the curvature numbers against the previous frame would be useful, as would something easier on the eye than the raw numbers, which change too rapidly with each frame to be of any real value. Perhaps a time-series calculation to get an average curvature over the previous x iterations.
6. Provide an example image of your result plotted back down onto the road such that the lane area is identified clearly.
After calculating the lane curvatures, the fill_poly function (lines #380 to #384) was used to fill in between the fitted lines, creating a filled shape that should take up the entire lane in front of the vehicle.
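The usual pattern here is to draw the polygon on a blank birds-eye canvas and then warp it back with the inverse matrix; `left_fitx`, `right_fitx`, `ploty`, and `Minv` are assumed names carried over from the earlier sketches:

```python
# Blank 3-channel canvas in the birds-eye frame
warp_zero = np.zeros_like(warped).astype(np.uint8)
color_warp = np.dstack((warp_zero, warp_zero, warp_zero))

# Polygon boundary: down the left fit, back up the right fit
pts_left = np.array([np.transpose(np.vstack([left_fitx, ploty]))])
pts_right = np.array([np.flipud(np.transpose(np.vstack([right_fitx, ploty])))])
pts = np.hstack((pts_left, pts_right))

cv2.fillPoly(color_warp, np.int32(pts), (0, 255, 0))

# Warp back to the perspective view and blend over the original frame
newwarp = cv2.warpPerspective(color_warp, Minv, (img.shape[1], img.shape[0]))
result = cv2.addWeighted(img, 1, newwarp, 0.3, 0)
```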
1. Provide a link to your final video output. Your pipeline should perform reasonably well on the entire project video (wobbly lines are ok but no catastrophic failures that would cause the car to drive off the road!).
Here's a link to my video result
The algorithm performs well for all of the video. While the accuracy at the top of the polygon (i.e. further away from the car) skews off-line, the main body of the polygon is an accurate representation of the lane from both a width and curvature perspective. The polygon fits reasonably well to the lane and doesn't show any large disturbance that would affect driving.
1. Briefly discuss any problems / issues you faced in your implementation of this project. Where will your pipeline likely fail? What could you do to make it more robust?
These are discussed in the relevant sections. Overall the algorithm performed acceptably on the images, and medium-to-well on the project video. It did, however, perform quite poorly on the challenge videos due to a number of factors touched on above. The bounding box limits the amount of curvature that we can calculate as we travel, and more work is needed around the thresholding; investigating different threshold values under different lighting conditions would be a useful piece of work.