Predictive Analysis: The Sinking Of The Titanic

The sinking of the RMS Titanic, which killed more than 1,500 passengers and crew, is one of the deadliest maritime disasters in history. One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class. The objective is to apply different machine learning models to analyse what sorts of people were likely to survive. The results of applying the machine learning algorithms are compared and analysed on the basis of accuracy.
Keywords- Titanic, Logistic Regression, Random
The iceberg collision ripped open Titanic’s hull in several places. Titanic carried more than 2,000 people of all ages, genders and classes that fateful night; only about 700 escaped in lifeboats, and the rest died in the icy water. The dead included a large number of men whose places were given to the many women and children on board. Men in the ship’s second class had among the lowest survival rates of any group.
Machine learning techniques are applied to predict which passengers survived the sinking of the Titanic. Features such as ticket fare, age, sex and class are used to make the predictions. Predictive analysis is a procedure that uses computational methods to find important and useful patterns in large data sets. Using the machine learning algorithms, survival is predicted from different combinations of features.
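To make the idea of predicting survival from different combinations of features concrete, the following is a minimal sketch rather than the method used in this analysis. It assumes a handful of made-up passenger records (not real Titanic data), hypothetical field names (fare, age, sex, pclass, survived), and a small hand-written logistic regression trained by gradient descent. Accuracy is measured on the same records used for training, so it overstates how well each model would do on unseen passengers.

```python
import math
from itertools import combinations

# Hypothetical toy records for illustration only (NOT real Titanic data).
PASSENGERS = [
    {"fare": 80.0, "age": 35, "sex": "female", "pclass": 1, "survived": 1},
    {"fare": 26.0, "age": 28, "sex": "female", "pclass": 2, "survived": 1},
    {"fare": 8.0,  "age": 19, "sex": "female", "pclass": 3, "survived": 1},
    {"fare": 15.0, "age": 4,  "sex": "female", "pclass": 3, "survived": 1},
    {"fare": 52.0, "age": 54, "sex": "male",   "pclass": 1, "survived": 0},
    {"fare": 13.0, "age": 30, "sex": "male",   "pclass": 2, "survived": 0},
    {"fare": 7.5,  "age": 22, "sex": "male",   "pclass": 3, "survived": 0},
    {"fare": 8.0,  "age": 40, "sex": "male",   "pclass": 3, "survived": 0},
]
FEATURES = ("fare", "age", "sex", "pclass")


def encode(p, features):
    """Turn one record into a numeric vector, roughly scaled to 0..1."""
    scaled = {
        "fare": p["fare"] / 100.0,
        "age": p["age"] / 100.0,
        "sex": 1.0 if p["sex"] == "female" else 0.0,
        "pclass": p["pclass"] / 3.0,
    }
    return [scaled[f] for f in features]


def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def training_accuracy(features):
    """Train on the chosen features and score on the same records."""
    X = [encode(p, features) for p in PASSENGERS]
    y = [p["survived"] for p in PASSENGERS]
    w, b = train_logreg(X, y)
    preds = [1 if sigmoid(b + sum(wj * xj for wj, xj in zip(w, x))) >= 0.5 else 0
             for x in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


# One accuracy score per non-empty combination of features.
results = {
    combo: training_accuracy(combo)
    for k in range(1, len(FEATURES) + 1)
    for combo in combinations(FEATURES, k)
}
for combo, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(combo, round(acc, 2))
```

On real data the records would be split into training and test sets before any such comparison.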
The objective is to perform exploratory data analysis on the dataset available at Kaggle and to understand the effect of each field on passenger survival by relating every field of the dataset to the “Survival” field. Machine learning algorithms are then applied to predict the outcome for new data. The applied algorithms are analysed and their accuracy is checked. The different algorithms are compared on the basis of accuracy, and the best-performing model is suggested for the dataset used.
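The two steps described above, relating each field to survival and comparing models by accuracy, can be sketched as follows. This is an illustration under stated assumptions: the rows are made up (not taken from the Kaggle dataset), the keys "Sex", "Pclass" and "Survival" are hypothetical field names, and the two "models" are simple hand-written rules standing in for trained algorithms.

```python
from collections import defaultdict

# Hypothetical toy rows for illustration only (NOT the Kaggle data).
rows = [
    {"Sex": "female", "Pclass": 1, "Survival": 1},
    {"Sex": "female", "Pclass": 2, "Survival": 1},
    {"Sex": "female", "Pclass": 3, "Survival": 1},
    {"Sex": "female", "Pclass": 3, "Survival": 0},
    {"Sex": "male",   "Pclass": 1, "Survival": 1},
    {"Sex": "male",   "Pclass": 2, "Survival": 0},
    {"Sex": "male",   "Pclass": 3, "Survival": 0},
    {"Sex": "male",   "Pclass": 3, "Survival": 0},
]


def survival_rate_by(rows, field):
    """Share of survivors for each distinct value of one field."""
    counts = defaultdict(lambda: [0, 0])  # value -> [survivors, total]
    for r in rows:
        counts[r[field]][0] += r["Survival"]
        counts[r[field]][1] += 1
    return {value: s / t for value, (s, t) in counts.items()}


def accuracy(predict, rows):
    """Fraction of rows where the prediction matches the recorded outcome."""
    return sum(predict(r) == r["Survival"] for r in rows) / len(rows)


# Exploratory step: relate each field to the "Survival" field.
for field in ("Sex", "Pclass"):
    print(field, survival_rate_by(rows, field))

# Comparison step: score each candidate model and pick the most accurate.
models = {
    "nobody survives": lambda r: 0,
    "women survive": lambda r: 1 if r["Sex"] == "female" else 0,
}
scores = {name: accuracy(model, rows) for name, model in models.items()}
best = max(scores, key=scores.get)
print(scores, "best:", best)
```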
As the name suggests, the random forest algorithm creates a forest made up of a number of trees. Adding more trees generally makes the predictions more stable and accurate, although the gains level off beyond a certain number. The random forest algorithm can be used for both classification and regression problems. For instance, it will take a random sample of 100 observations and 5 randomly chosen initial variables to build a model. It will repeat the process (say) 10 times and then make a final prediction on each observation. The final prediction is a function of the individual predictions: the mean for regression, or the majority vote for classification.
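The sampling procedure described above can be sketched in a few lines. This is a simplified illustration, not a full random forest: each "tree" here is a single-split stump, the features are drawn once per tree rather than at every split, and the data are synthetic numbers generated for the example. The function names and parameters (fit_stump, fit_forest, sample_size, n_features) are assumptions made for this sketch.

```python
import random


def fit_stump(X, y, features):
    """Find the single (feature, threshold) split with the lowest squared
    error; each side of the split predicts the mean label on that side."""
    best = None
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((yi - lm) ** 2 for yi in left)
                   + sum((yi - rm) ** 2 for yi in right))
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    if best is None:  # no valid split: predict the overall mean everywhere
        m = sum(y) / len(y)
        return (features[0], float("inf"), m, m)
    return best[1:]


def fit_forest(X, y, n_trees=10, sample_size=100, n_features=5, seed=0):
    """Repeat: draw observations with replacement, draw a feature subset,
    and fit one stump on that sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(sample_size)]
        feats = rng.sample(range(len(X[0])), n_features)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict(forest, row):
    """Final prediction: the mean of the individual tree predictions."""
    votes = [lm if row[f] <= t else rm for f, t, lm, rm in forest]
    return sum(votes) / len(votes)


# Synthetic data for illustration: 200 observations, 8 numeric variables,
# and a label that depends only on the first variable.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(8)] for _ in range(200)]
y = [1 if row[0] > 0.5 else 0 for row in X]

forest = fit_forest(X, y)
probs = [predict(forest, row) for row in X]
print("first five averaged predictions:", [round(p, 2) for p in probs[:5]])
```

For classification, the averaged value can be read as the share of trees voting for survival and thresholded at 0.5 to obtain a class label.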
Decision Tree
A decision tree is a type of supervised learning algorithm that is generally used in classification problems. It is suitable for both categorical and continuous input and output variables. Each internal node, including the root, represents a single input variable (x) and a split point on that variable. The leaf nodes of the tree contain an output variable (y) which is used to make a prediction. For example, given a dataset with two inputs (x), height in centimetres and weight in kilograms, the tree could output sex as male or female (a hypothetical example, for demonstration purposes only). There are two types of decision tree based on the type of target
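The height and weight example can be turned into a small sketch of how a single node chooses its input variable and split point. The six records below are made up for demonstration, and Gini impurity is used as the split criterion as one common choice; the text above does not say which criterion is intended. The helper names (gini, best_split, predict) are assumptions for this sketch.

```python
from collections import Counter

# Hypothetical records for demonstration only.
people = [
    {"height": 180, "weight": 80, "sex": "male"},
    {"height": 175, "weight": 77, "sex": "male"},
    {"height": 170, "weight": 72, "sex": "male"},
    {"height": 165, "weight": 60, "sex": "female"},
    {"height": 160, "weight": 55, "sex": "female"},
    {"height": 155, "weight": 50, "sex": "female"},
]


def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, features, target):
    """Try every feature and observed value as a threshold; keep the split
    whose two sides have the lowest size-weighted Gini impurity."""
    best = None  # (weighted impurity, feature, threshold)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            left = [r[target] for r in rows if r[f] <= t]
            right = [r[target] for r in rows if r[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


score, feature, threshold = best_split(people, ["height", "weight"], "sex")


def predict(row):
    """Leaf prediction: the majority label on the matching side of the split."""
    side = [r["sex"] for r in people if (r[feature] <= threshold) == (row[feature] <= threshold)]
    return Counter(side).most_common(1)[0][0]


print(feature, threshold, score)
print(predict({"height": 178, "weight": 75}))
```

A full tree would apply the same search again to each side of the split until the groups are pure or a depth limit is reached.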
