Skip to content

Quantum-Software-Development/Integrated_Project-Business

Repository files navigation


Integrated Project for Business



$$\Huge {\textbf{\color{DodgerBlue} GOOD DECISIONS = GOOD RESULTS}}$$



Sponsor Quantum Software Development


t describes the process of scaling or normalizing data within a specific range, typically to a standardized scale, for example, from 0 to 1. This is a common technique in data analysis and machine learning.



$$X_{normalized} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$


$$X_{\max} - X_{\min} = \text{Amplitude}$$


Is the amplitude, a way to represent the range of data values before normalization.


To calculate the standardization of the variables salario, n_filhos, and idade using both the Z-Score and Range methods, and to evaluate the mean, standard deviation, maximum, and minimum before and after standardization, we can follow these steps:

Compute the mean, standard deviation, maximum, and minimum for each of the variables (n_filhos, salario, idade).

We standardize the variables using the Z-Score method, which is computed as:

$Z$ = $\frac{X - \mu}{\sigma}$

Z = \frac{X - \mu}{\sigma}

Where:

  • $( \mu )$ is the mean,

  • $( \sigma )$ is the standard deviation.


We scale the data using the Min-Max method, which scales the values to a [0, 1] range using:

$X'$ = $\frac{X - \min(X)}{\max(X) - \min(X)}$

X' = \frac{X - \min(X)}{\max(X) - \min(X)}

Where:

  • X is the original value,
  • min(X) is the minimum value,
  • max(X) is the maximum value.

Compute the mean, standard deviation, maximum, and minimum of the standardized data for both Z-Score and Range methods.

The output will provide the descriptive statistics before and after each standardization method, allowing you to compare the effects of Z-Score and Range standardization on the dataset.


Practical Example for Calculating this Normalized Value in Python:

Use this dataset

The code demonstrates how to apply Z-Score and Range (Min-Max) standardization to the variables salario, n_filhos, and idade in a dataset. It also evaluates and compares the mean, standard deviation, minimum, and maximum values before and after the standardization methods are applied.


# Importing the necessary libraries
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Load the data from the Excel file
# df = pd.read_excel('use-your-own-dataset.xlsx') - optional
df = pd.read_excel('cadastro_funcionarios.xlsx')
df.head()  # Displaying the first few rows of the dataset to understand its structure

# Step 1: Evaluate the mean, std, max, and min before standardization
before_std_stats = {
    'mean_n_filhos': df['n_filhos'].mean(),
    'std_n_filhos': df['n_filhos'].std(),
    'min_n_filhos': df['n_filhos'].min(),
    'max_n_filhos': df['n_filhos'].max(),
    
    'mean_salario': df['salario'].mean(),
    'std_salario': df['salario'].std(),
    'min_salario': df['salario'].min(),
    'max_salario': df['salario'].max(),
    
    'mean_idade': df['idade'].mean(),
    'std_idade': df['idade'].std(),
    'min_idade': df['idade'].min(),
    'max_idade': df['idade'].max(),
}

# Display the statistics before standardization
before_std_stats

Cell 4: Apply Z-Score standardization

# Step 2: Z-Score Standardization
df_zscore = df[['n_filhos', 'salario', 'idade']].apply(lambda x: (x - x.mean()) / x.std())

# Display the standardized data
df_zscore.head()

Cell 5: Evaluate the statistics after Z-Score standardization

# Step 3: Evaluate the mean, std, max, and min after Z-Score standardization
after_zscore_stats = {
    'mean_n_filhos_zscore': df_zscore['n_filhos'].mean(),
    'std_n_filhos_zscore': df_zscore['n_filhos'].std(),
    'min_n_filhos_zscore': df_zscore['n_filhos'].min(),
    'max_n_filhos_zscore': df_zscore['n_filhos'].max(),
    
    'mean_salario_zscore': df_zscore['salario'].mean(),
    'std_salario_zscore': df_zscore['salario'].std(),
    'min_salario_zscore': df_zscore['salario'].min(),
    'max_salario_zscore': df_zscore['salario'].max(),
    
    'mean_idade_zscore': df_zscore['idade'].mean(),
    'std_idade_zscore': df_zscore['idade'].std(),
    'min_idade_zscore': df_zscore['idade'].min(),
    'max_idade_zscore': df_zscore['idade'].max(),
}

# Display the statistics after Z-Score standardization
after_zscore_stats

Cell 6: Apply Range Standardization (Min-Max Scaling)

# Step 4: Range Standardization (Min-Max Scaling)
scaler = MinMaxScaler()
df_range = pd.DataFrame(scaler.fit_transform(df[['n_filhos', 'salario', 'idade']]), columns=['n_filhos', 'salario', 'idade'])

# Display the scaled data
df_range.head()

Cell 7: Evaluate the statistics after Range standardization

# Step 5: Evaluate the mean, std, max, and min after Range standardization
after_range_stats = {
    'mean_n_filhos_range': df_range['n_filhos'].mean(),
    'std_n_filhos_range': df_range['n_filhos'].std(),
    'min_n_filhos_range': df_range['n_filhos'].min(),
    'max_n_filhos_range': df_range['n_filhos'].max(),
    
    'mean_salario_range': df_range['salario'].mean(),
    'std_salario_range': df_range['salario'].std(),
    'min_salario_range': df_range['salario'].min(),
    'max_salario_range': df_range['salario'].max(),
    
    'mean_idade_range': df_range['idade'].mean(),
    'std_idade_range': df_range['idade'].std(),
    'min_idade_range': df_range['idade'].min(),
    'max_idade_range': df_range['idade'].max(),
}

# Display the statistics after Range standardization
after_range_stats


Pratical Example for Calculating this Normalized Value in Excel

Use this dataset

To standardize the variables (salary, number of children, and age) in Excel using the Z-Score and Range methods, you can follow these steps:


Steps for Z-Score in Excel:

Use the AVERAGE function to calculate the mean of the column. For example, to find the mean of the salary (column E), use:

=AVERAGE(E2:E351)

Use the STDEV.P function to calculate the standard deviation of the column. For example, to find the standard deviation of the salary (column E), use:

=STDEV.P(E2:E351)

For each value in the column, apply the Z-Score formula. In the first row of the new column, use:

=(E2 - AVERAGE(E$2:E$351)) / STDEV.P(E$2:E$351)

Example for Salary:

In cell H2 (new column for standardized salary), write

=(E2 - AVERAGE(E$2:E$351)) / STDEV.P(E$2:E$351)

Then, drag it down to the rest of the rows.

Repeat the same steps for the variables n_filhos (column D) and idade (column F).


Steps for Range Standardization in Excel:

Use the MIN and MAX functions to find the minimum and maximum values of the column. For example, to find the min and max of salary (column E), use:

=MIN(E2:E351)
=MAX(E2:E351)

For each value in the column, apply the range formula. In the first row of the new column, use:

=(E2 - MIN(E$2:E$351)) / (MAX(E$2:E$351) - MIN(E$2:E$351))

Example for Salary:

In cell I2 (new column for range standardized salary), write:

=(E2 - MIN(E$2:E$351)) / (MAX(E$2:E$351) - MIN(E$2:E$351))

Then, drag it down to the rest of the rows. Repeat the same steps for the variables n_filhos (column D) and idade (column F).


Summary of the Process

Z-Score Standardization centers the data around zero and scales it based on the standard deviation.

Range Standardization (Min-Max Scaling) rescals the data to a [0, 1] range.

Both techniques were applied (given dataset) to the columns n_filhos, salario, and idade, and the statistics (mean, std, min, max) were calculated before and after the standardization methods.



Copyright 2025 Quantum Software Development. Code released under the MIT License license.

About

Integrated Project for Business

Resources

License

Code of conduct

Security policy

Stars

Watchers

Forks

Sponsor this project