Wednesday, July 26, 2023

Linear regression purely in Python

 Yes, we can implement a simple linear regression algorithm using only Python, without relying on any external libraries like scikit-learn. The key components of the algorithm involve calculating the slope (coefficients) and intercept of the line that best fits the data.


Here's a pure Python implementation of linear regression using the method of least squares:


```python

# Step 1: Load the data (Boston Housing dataset)

# For this example, let's use a simplified version of the dataset with one feature for simplicity.

# In a real-world scenario, you would load the data from a file or another source.

X = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # Input feature (e.g., number of rooms)

y = [3.0, 4.0, 2.5, 5.0, 6.0, 8.0, 7.5]  # Target variable (e.g., median house price)


# Step 2: Implement linear regression

def linear_regression(X, y):

    n = len(X)

    sum_x = sum(X)

    sum_y = sum(y)

    sum_xy = sum(x * y for x, y in zip(X, y))

    sum_x_squared = sum(x ** 2 for x in X)


    # Calculate the slope (coefficient) and intercept of the line

    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x_squared - sum_x ** 2)

    intercept = (sum_y - slope * sum_x) / n


    return slope, intercept


# Step 3: Fit the model and get the coefficients

slope, intercept = linear_regression(X, y)


# Step 4: Make predictions on new data

def predict(X, slope, intercept):

    return [slope * x + intercept for x in X]


# Step 5: Evaluate the model's performance

# For simplicity, let's calculate the mean squared error (MSE).

def mean_squared_error(y_true, y_pred):

    n = len(y_true)

    squared_errors = [(y_true[i] - y_pred[i]) ** 2 for i in range(n)]

    return sum(squared_errors) / n


# Make predictions on the training data

y_pred_train = predict(X, slope, intercept)


# Calculate the mean squared error of the predictions

mse_train = mean_squared_error(y, y_pred_train)


print(f"Slope (Coefficient): {slope:.4f}")

print(f"Intercept: {intercept:.4f}")

print(f"Mean Squared Error: {mse_train:.4f}")

```


Note that this is a simplified example using a small dataset. In a real-world scenario, you would load a larger dataset and perform additional preprocessing steps to prepare the data for the linear regression model. Additionally, scikit-learn and other libraries offer more efficient and optimized implementations of linear regression, so using them is recommended for practical applications. However, this pure Python implementation illustrates the fundamental concepts behind linear regression.

No comments:

Post a Comment

ASP.NET Core

 Certainly! Here are 10 advanced .NET Core interview questions covering various topics: 1. **ASP.NET Core Middleware Pipeline**: Explain the...