Housing Price Prediction

This project follows Chapter 2 of the book Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow. The goal is to build a model that predicts housing prices from the California housing dataset.

Project link: peeplika/Housing-price-prediction (github.com)

Step 1: Downloading the Data

The project uses a shortened version of the California housing dataset. The smaller dataset is easier to work with while still providing enough data to train the model.
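A minimal sketch of a download-and-load helper, assuming the data is published as a .tgz archive containing a single CSV file. The function names, the default directory, and the file name housing.csv are assumptions made for illustration, and the archive URL is left as a parameter because the post does not state it.

```python
import tarfile
import urllib.request
from pathlib import Path

import pandas as pd


def fetch_housing_data(url, data_dir="datasets/housing"):
    """Download a .tgz archive from `url` and extract it into `data_dir`."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    archive_path = data_dir / "housing.tgz"
    urllib.request.urlretrieve(url, archive_path)
    with tarfile.open(archive_path) as archive:
        archive.extractall(path=data_dir)
    return data_dir


def load_housing_data(data_dir="datasets/housing", filename="housing.csv"):
    """Read the extracted CSV file (assumed name) into a pandas DataFrame."""
    return pd.read_csv(Path(data_dir) / filename)
```

Keeping the download and the load in separate functions means the archive only has to be fetched once; later runs can call load_housing_data directly.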

Step 2: Splitting the Data

A portion of the data was set aside for testing before any analysis was performed. Because the test data stays unseen during both exploration and training, the final evaluation gives a realistic estimate of how the model performs on new data.
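The split can be sketched with scikit-learn's train_test_split. The 20% test size, the random seed, and the synthetic stand-in table are assumptions for illustration; the post does not say which ratio or sampling strategy the project used.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the housing table (assumption: the real data has more columns).
rng = np.random.default_rng(42)
housing = pd.DataFrame({
    "median_income": rng.uniform(0.5, 15.0, size=1000),
    "median_house_value": rng.uniform(15_000, 500_000, size=1000),
})

# Hold out 20% of the rows (assumed ratio); a fixed random_state keeps the split reproducible.
train_set, test_set = train_test_split(housing, test_size=0.2, random_state=42)
```

Fixing random_state matters here: without it, each run would produce a different test set, and over many runs the whole dataset would end up being seen.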

Step 3: Visualizing the Data

Histograms and other visualizations were created to gain a better understanding of the dataset. They show how each feature is distributed, which makes patterns and outliers easier to spot.
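One way to produce such histograms is pandas' DataFrame.hist, which draws one panel per numeric column. The helper name plot_histograms, the bin count, and the synthetic columns below are assumptions for illustration, not the project's actual plotting code.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt


def plot_histograms(df, bins=50):
    """Draw one histogram per numeric column and return the grid of Axes."""
    axes = df.hist(bins=bins, figsize=(12, 8))
    plt.tight_layout()
    return axes


# Synthetic stand-in data (not the real housing table).
rng = np.random.default_rng(42)
sample = pd.DataFrame({
    "median_income": rng.lognormal(mean=1.2, sigma=0.5, size=1000),
    "housing_median_age": rng.integers(1, 53, size=1000),
    "median_house_value": rng.lognormal(mean=12.0, sigma=0.5, size=1000),
})

axes = plot_histograms(sample)
# Call plt.show() in a script or notebook to display the figure.
```

Long right tails or a spike at one edge of a panel are the kind of pattern worth noting before choosing how to transform a feature.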

Step 4: Modifying the Data

The data was prepared for training in three ways:

  • Adding new, relevant features such as rooms_per_household.

  • Handling missing values to avoid losing critical information.

  • Converting categorical features to numerical values using one-hot encoding.
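The three preparation steps above can be sketched with pandas and scikit-learn. The column names (total_rooms, total_bedrooms, households, ocean_proximity), the median imputation strategy, and the sample values are assumptions for illustration; only rooms_per_household and one-hot encoding are named in the post.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import OneHotEncoder

# Tiny made-up sample with assumed column names and one missing value.
housing = pd.DataFrame({
    "total_rooms": [800.0, 1200.0, 600.0, 1000.0],
    "total_bedrooms": [129.0, np.nan, 190.0, 235.0],
    "households": [200.0, 400.0, 150.0, 250.0],
    "ocean_proximity": ["NEAR BAY", "INLAND", "NEAR BAY", "INLAND"],
})

# 1. Add a derived feature: average number of rooms per household.
housing["rooms_per_household"] = housing["total_rooms"] / housing["households"]

numeric_cols = ["total_rooms", "total_bedrooms", "households", "rooms_per_household"]
categorical_cols = ["ocean_proximity"]

preprocess = ColumnTransformer([
    # 2. Fill missing numeric values with the column median (assumed strategy).
    ("num", SimpleImputer(strategy="median"), numeric_cols),
    # 3. One-hot encode the categorical column.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])

prepared = preprocess.fit_transform(housing)
```

Imputing a value keeps the row in the training set, whereas dropping rows with missing entries would discard the other features recorded for that district.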

Step 5: Trying Different Models

Several machine learning models were trained and compared to determine which one performed best on the dataset. Comparing candidates on the same data makes it more likely that a suitable model is selected.
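A comparison like this is often done with cross-validation, so that every candidate is scored on data it was not trained on. The three models, the 5-fold setting, the RMSE metric, and the synthetic data below are assumptions for illustration; the post does not list which models were tried.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression data standing in for the prepared training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X @ np.array([3.0, -2.0, 0.5, 0.0]) + rng.normal(scale=0.5, size=300)

# Candidate models (assumed selection).
models = {
    "linear_regression": LinearRegression(),
    "decision_tree": DecisionTreeRegressor(random_state=42),
    "random_forest": RandomForestRegressor(n_estimators=50, random_state=42),
}

rmse_by_model = {}
for name, model in models.items():
    # scikit-learn reports negated MSE, so flip the sign before taking the root.
    scores = cross_val_score(model, X, y, scoring="neg_mean_squared_error", cv=5)
    rmse_by_model[name] = np.sqrt(-scores).mean()

# Lower RMSE is better.
best_name = min(rmse_by_model, key=rmse_by_model.get)
```

The ranking on this toy data says nothing about the real dataset; the point is only that all candidates are scored with the same folds and the same metric.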

Step 6: Fine-Tuning the Model

After selecting the best-performing model, hyperparameter tuning was applied to improve its performance further. Tuning evaluates candidate settings and keeps the combination that scores best on validation data.
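A grid search is one common way to do this tuning. The sketch below assumes a random forest was the selected model and uses a small, made-up parameter grid and synthetic data; the actual model and grid used in the project are not stated in the post.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic data standing in for the prepared training set.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 0] ** 2 + 2.0 * X[:, 1] + rng.normal(scale=0.3, size=200)

# Hypothetical grid: 2 x 2 = 4 combinations, each scored with 3-fold cross-validation.
param_grid = {
    "n_estimators": [10, 30],
    "max_features": [2, 4],
}

grid_search = GridSearchCV(
    RandomForestRegressor(random_state=42),
    param_grid,
    cv=3,
    scoring="neg_mean_squared_error",
)
grid_search.fit(X, y)

best_params = grid_search.best_params_
best_rmse = np.sqrt(-grid_search.best_score_)
# By default the best estimator is refit on all of X and y.
best_model = grid_search.best_estimator_
```

The cost grows with the product of the grid sizes and the number of folds, so a randomized search may be a better fit when many hyperparameters are involved.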

Step 7: Testing the Model

Finally, the model was tested on the previously unseen test data. Performance metrics were recorded to evaluate the model's ability to predict housing prices accurately.
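The final evaluation can be sketched as a single prediction pass over the held-out set. RMSE is assumed as the metric here, and the linear model and synthetic data are placeholders; the post does not say which metrics were recorded.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the prepared features and prices.
rng = np.random.default_rng(1)
X = rng.normal(size=(500, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=1.0, size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Placeholder for the tuned model from the previous step.
final_model = LinearRegression().fit(X_train, y_train)

# Evaluate once on the test set; RMSE is in the same units as the target.
predictions = final_model.predict(X_test)
test_rmse = np.sqrt(mean_squared_error(y_test, predictions))
```

The test set should be used only once, at the end. Tuning the model again after seeing this score would leak test information into training.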