In this project, I explore photometric redshift (photo-z) estimation using machine learning, with data drawn from the Main Galaxy Sample (MGS) in the Sloan Digital Sky Survey (SDSS) Data Release 18. The focus is on improving photo-z estimation accuracy by experimenting with different input parameters and configurations.
Key research questions addressed in this analysis:
- Effect of Band Reduction: How does the accuracy of photometric redshift estimation change when the number of photometric bands is reduced?
- Inclusion of Band Magnitude Error: What is the impact on photo-z accuracy when band magnitude errors are included in the model input?
- Band Contribution: Which photometric band (among u, g, r, i, z) degrades the accuracy of photo-z estimation the most?
- Inclusion of Morphological Parameters: How does adding morphological information affect the accuracy of photo-z estimation?
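The four questions above each correspond to a different choice of input columns. As a rough sketch of how those experiments might be enumerated (not necessarily how the project organizes them), the snippet below builds one feature list per experiment. The column names (`err_u`, `petroR50_r`, `petroR90_r`) and the experiment labels are illustrative assumptions; the actual catalogue columns may differ.

```python
BANDS = ["u", "g", "r", "i", "z"]  # SDSS photometric bands

# Hypothetical morphological columns, used here only for illustration.
MORPHOLOGY = ["petroR50_r", "petroR90_r"]


def feature_sets(bands=BANDS, morphology=MORPHOLOGY):
    """Map an experiment label to the list of input columns it would use."""
    sets = {"all_bands": list(bands)}
    # Band reduction / band contribution: drop one band at a time.
    for dropped in bands:
        sets[f"without_{dropped}"] = [b for b in bands if b != dropped]
    # Magnitude errors: append one (hypothetically named) error column per band.
    sets["all_bands_with_errors"] = list(bands) + [f"err_{b}" for b in bands]
    # Morphology: append the morphological parameters to the magnitudes.
    sets["all_bands_with_morphology"] = list(bands) + list(morphology)
    return sets


experiments = feature_sets()
for label, columns in experiments.items():
    print(f"{label}: {columns}")
```

Each list can then be used to select columns from the catalogue before training, so that every experiment is compared against the same `all_bands` baseline.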
The data used in this project is taken from the SDSS Data Release 18 (DR18). Specifically, the photometric measurements and corresponding morphological parameters of galaxies in the Main Galaxy Sample (MGS) are used as input for training and evaluating machine learning models.
For the estimation of photometric redshifts, I have used the RandomForestRegressor from the scikit-learn library, with its default configuration:
    sklearn.ensemble.RandomForestRegressor(
        n_estimators=100,
        criterion='squared_error',
        max_depth=None,
        min_samples_split=2,
        min_samples_leaf=1,
        max_features=1.0,
        bootstrap=True
    )
Objectives of this project:
- To improve the accuracy of photometric redshift estimation by optimizing the input features.
- To identify which photometric bands and parameters most significantly affect the estimation process.
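One possible way to rank the bands by their effect on accuracy is a leave-one-band-out comparison: retrain without each band in turn and record how much the error grows relative to the all-band baseline. The sketch below illustrates the idea on synthetic data in which, by construction, only g and r carry redshift information. The data, the RMSE metric, and the helper `rmse_for` are assumptions for illustration, not the project's actual pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

BANDS = ["u", "g", "r", "i", "z"]
rng = np.random.default_rng(1)

# Synthetic magnitudes; in this toy setup the redshift depends only on g - r.
X = rng.normal(loc=18.0, scale=1.0, size=(600, len(BANDS)))
z_spec = 0.1 + 0.02 * (X[:, 1] - X[:, 2]) + rng.normal(scale=0.005, size=600)
X_train, X_test, z_train, z_test = train_test_split(
    X, z_spec, test_size=0.25, random_state=1
)


def rmse_for(columns):
    """Train on the given column indices and return the held-out RMSE."""
    model = RandomForestRegressor(random_state=1)
    model.fit(X_train[:, columns], z_train)
    residual = model.predict(X_test[:, columns]) - z_test
    return float(np.sqrt(np.mean(residual ** 2)))


baseline = rmse_for(list(range(len(BANDS))))

# Increase in RMSE when each band is left out of the inputs.
increase = {}
for i, band in enumerate(BANDS):
    kept = [j for j in range(len(BANDS)) if j != i]
    increase[band] = rmse_for(kept) - baseline

worst = max(increase, key=increase.get)
for band in BANDS:
    print(f"without {band}: RMSE change = {increase[band]:+.4f}")
print(f"Largest degradation when dropping: {worst}")
```

Retraining per band is more expensive than reading a single model's feature importances, but it measures the quantity of interest directly: how much the estimate worsens when a band is unavailable.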