Cancer is a disease in which some of the body's cells grow out of control and spread to other parts of the body.
Cancer can develop almost anywhere in the human body, which is made up of trillions of cells.
Human cells normally grow and multiply (through a process known as cell division) to create new cells as the body requires them. When cells grow old or become damaged, they die, and new cells take their place.
Occasionally, this orderly process breaks down, and damaged or abnormal cells multiply when they shouldn't. These cells can form tumors, which are masses of tissue.
This analysis presents a thorough investigation of a cancer dataset. I developed a model that predicts whether a person has cancer, with the aim of making that determination more accurately.
After collecting the data, I checked the dataset for missing values.
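The missing-value check can be sketched in a few lines of pandas. This is a minimal illustration rather than the exact code used in the analysis: the helper `missing_value_report` and the tiny `sample` frame with its column names are assumptions made up for the example.

```python
import pandas as pd


def missing_value_report(df: pd.DataFrame) -> pd.Series:
    """Return the count of missing values per column, keeping only columns that have any."""
    counts = df.isnull().sum()
    return counts[counts > 0]


# Made-up frame standing in for the real dataset (column names are illustrative only)
sample = pd.DataFrame({
    "AGE": [61, 54, None, 70],
    "SMOKING": [1, 2, 2, 1],
    "ANXIETY": [2, None, None, 1],
})

report = missing_value_report(sample)
print(report)
```

An empty report means no column contains missing values, so no imputation or row removal is needed before modelling.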
I divided the analysis into three sections: Anxiety, Overall breakdown, and Breakdown based on Drinking and Smoking.
I focused on these three because drinking, smoking, and anxiety are strongly interconnected.
Later, I used the k-nearest neighbours (KNN) algorithm to build a model. I chose KNN because my dataset was relatively small, a setting where KNN stays practical since it compares each new record against every stored record (tuple), and because it let me predict with a high degree of accuracy.
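To make the idea behind KNN concrete, here is a small from-scratch sketch: a new record is labelled by majority vote among its k closest training records. The function `knn_predict` and the toy arrays are hypothetical and serve only as an illustration; the model in this analysis uses scikit-learn's `KNeighborsClassifier`.

```python
from collections import Counter

import numpy as np


def knn_predict(X_train, y_train, x_new, k=3):
    """Predict the label of x_new by majority vote among its k nearest training points."""
    # Euclidean distance from x_new to every training record
    distances = np.linalg.norm(X_train - x_new, axis=1)
    # Indices of the k closest records
    nearest = np.argsort(distances)[:k]
    # Count the labels of those neighbours and return the most frequent one
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters with labels 0 and 1
X_toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                  [1.0, 1.0], [0.9, 1.1], [1.1, 0.9]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

pred_a = knn_predict(X_toy, y_toy, np.array([0.05, 0.05]), k=3)
pred_b = knn_predict(X_toy, y_toy, np.array([1.0, 0.95]), k=3)
print(pred_a, pred_b)
```

Because every prediction scans the whole training set, the cost grows with the number of stored records, which is one reason the method suits small datasets.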
The chart shows little difference between the number of men and women who have cancer; the two groups are close together.
Cause of this behaviour:
The graph shows that the 60-70 age group contains the greatest number of people, followed by the 50-60 group and then the 70-80 group.
According to the dataset, the number of people experiencing anxiety tracks the size of each age group, so the 60-70 age range has the most people with anxiety.
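A breakdown like this can be produced by binning ages into ten-year groups and counting people and anxiety cases per group. The sketch below is an assumption-laden illustration: the column names `AGE` and `ANXIETY`, the 0/1 coding, and the sample values are invented and are not taken from the dataset.

```python
import pandas as pd

# Made-up records; in this sketch ANXIETY is coded 1 = reports anxiety, 0 = does not
sample = pd.DataFrame({
    "AGE": [45, 52, 58, 61, 63, 66, 69, 72, 77],
    "ANXIETY": [0, 1, 0, 1, 1, 0, 1, 1, 0],
})

# Ten-year bins that include the lower edge and exclude the upper edge, e.g. [60, 70)
bins = [40, 50, 60, 70, 80]
labels = ["40-50", "50-60", "60-70", "70-80"]
sample["AGE_GROUP"] = pd.cut(sample["AGE"], bins=bins, labels=labels, right=False)

# Number of people and number of anxiety cases in each age group
summary = sample.groupby("AGE_GROUP", observed=False).agg(
    people=("AGE", "count"),
    with_anxiety=("ANXIETY", "sum"),
)
print(summary)
```

Dividing `with_anxiety` by `people` would give a rate per group, which helps separate a genuinely higher anxiety rate from an effect of group size alone.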
Two key conclusions may be drawn from the graph:
Model Code:
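from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
# `data` is the cancer DataFrame loaded earlier
X = data.iloc[:, 2:15]  # independent features
y = data.iloc[:, 15]  # dependent feature (target)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=12)
# Feature scaling: fit the scaler on the training split only, then apply it to both splits
scaler = preprocessing.StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
# Model creation
classifier = KNeighborsClassifier(n_neighbors=8, weights='uniform', metric='euclidean').fit(X_train, y_train)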
Accuracy of Model: 91.94 %
F-Score of Model: 95.50 %
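Scores of this kind can be computed with scikit-learn's `accuracy_score` and `f1_score`. The sketch below is self-contained and therefore runs on synthetic data from `make_classification`, which is an assumption standing in for the cancer dataset; its output will not match the figures reported above.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 300 records, 13 features, binary target
X_demo, y_demo = make_classification(n_samples=300, n_features=13, random_state=12)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=12)

# Scale with statistics learned from the training split only
scaler = StandardScaler().fit(X_tr)
model = KNeighborsClassifier(n_neighbors=8, weights="uniform", metric="euclidean")
model.fit(scaler.transform(X_tr), y_tr)

# Predict on the held-out split and score the predictions
y_pred = model.predict(scaler.transform(X_te))
accuracy = accuracy_score(y_te, y_pred)
f_score = f1_score(y_te, y_pred)  # F1 for the positive class (label 1)

print(f"Accuracy of Model: {accuracy * 100:.2f} %")
print(f"F-Score of Model: {f_score * 100:.2f} %")
```

Accuracy counts every correct prediction equally, while the F-score balances precision and recall for the positive class, so the two can differ noticeably when one class is much more common than the other.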
Author's Note:
There are numerous strategies to either avoid cancer or lessen the harm it causes.
The best approach is to maintain good health by emphasising healthy behaviours and exercising regularly.
A healthy body acts as a barrier against illness.
The more we practise these habits, the less likely we are to fall ill, which lets us live longer and with less stress.
Being optimistic also plays a crucial part in our health.