Data Pre-processing
This notebook takes raw photometric data (energy measured at different wavelengths of the visible spectrum) from stellar objects and prepares it for analysis with several Gaussian mixture clustering models:
- the raw data is crossmatched against a catalog of standard (non-variable), stacked stars to find the objects common to both
- the data is handled in Spark dataframes, then converted to numpy arrays and saved for analysis in the GMM_plots notebook
- a plot of the color distributions for the catalog stars is generated at the end
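
The crossmatch and array-conversion steps above can be sketched on toy data. This is only an illustration, not the notebook's actual code: it uses plain numpy in place of Spark so that it runs without a cluster, it assumes objects are matched on a shared integer ID (the real crossmatch may instead be positional), and the band names, ID values, magnitudes, and the helper names `crossmatch_by_id` and `magnitudes_to_colors` are all hypothetical. The one non-assumption is the definition of a color as the difference between magnitudes in two bands.

```python
import numpy as np


def crossmatch_by_id(raw_ids, catalog_ids):
    """Return the IDs present in both inputs and their row indices in raw_ids.

    Assumes each ID appears at most once per array (hypothetical setup).
    """
    common_ids, raw_idx, _ = np.intersect1d(
        raw_ids, catalog_ids, return_indices=True
    )
    return common_ids, raw_idx


def magnitudes_to_colors(mags):
    """Convert per-band magnitudes to colors (adjacent-band differences).

    A row [g, r, i] becomes [g - r, r - i].
    """
    return mags[:, :-1] - mags[:, 1:]


# Hypothetical raw photometry: one row per object, columns are g, r, i magnitudes.
raw_ids = np.array([101, 102, 103, 104, 105])
raw_mags = np.array([
    [18.2, 17.9, 17.8],
    [19.5, 18.7, 18.4],
    [17.1, 16.9, 16.8],
    [20.3, 19.2, 18.7],
    [18.8, 18.3, 18.1],
])

# Hypothetical catalog of standard stars (ID 999 has no raw counterpart).
catalog_ids = np.array([105, 102, 999, 103])

# Keep only the raw objects that also appear in the catalog.
common_ids, raw_idx = crossmatch_by_id(raw_ids, catalog_ids)
matched_mags = raw_mags[raw_idx]

# Color array of the kind a Gaussian mixture model could be fit to.
colors = magnitudes_to_colors(matched_mags)
print(common_ids)
print(colors)
```

In a Spark-based workflow the matching step would typically be an inner join on the ID column, and the resulting array could be written out with `np.save` for a later notebook to load.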