{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Kaggle: the Titanic disaster"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I. Import libraries and datasets"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import pandas as pd\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn as sns\n",
"import warnings\n",
"warnings.filterwarnings('ignore')"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
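"# Load the Kaggle training and test sets (expected in the working directory)\n",
"train = pd.read_csv('train.csv')\n",
"test = pd.read_csv('test.csv')\n",
"# Keep the test-set passenger IDs for later use\n",
"PassengerId = test['PassengerId']\n",
"# Stack train and test into one frame so both can be processed together\n",
"all_data = pd.concat([train, test], ignore_index=True)"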
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"II. Data analysis"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Overview"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>PassengerId</th>\n",
" <th>Survived</th>\n",
" <th>Pclass</th>\n",
" <th>Name</th>\n",
" <th>Sex</th>\n",
" <th>Age</th>\n",
" <th>SibSp</th>\n",
" <th>Parch</th>\n",
" <th>Ticket</th>\n",
" <th>Fare</th>\n",
" <th>Cabin</th>\n",
" <th>Embarked</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" <td>Braund, Mr. Owen Harris</td>\n",
" <td>male</td>\n",
" <td>22.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>A/5 21171</td>\n",
" <td>7.2500</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>2</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>Cumings, Mrs. John Bradley (Florence Briggs Th...</td>\n",
" <td>female</td>\n",
" <td>38.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>PC 17599</td>\n",
" <td>71.2833</td>\n",
" <td>C85</td>\n",
" <td>C</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>3</td>\n",
" <td>1</td>\n",
" <td>3</td>\n",
" <td>Heikkinen, Miss. Laina</td>\n",
" <td>female</td>\n",
" <td>26.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>STON/O2. 3101282</td>\n",
" <td>7.9250</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>4</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>Futrelle, Mrs. Jacques Heath (Lily May Peel)</td>\n",
" <td>female</td>\n",
" <td>35.0</td>\n",
" <td>1</td>\n",
" <td>0</td>\n",
" <td>113803</td>\n",
" <td>53.1000</td>\n",
" <td>C123</td>\n",
" <td>S</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>5</td>\n",
" <td>0</td>\n",
" <td>3</td>\n",
" <td>Allen, Mr. William Henry</td>\n",
" <td>male</td>\n",
" <td>35.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>373450</td>\n",
" <td>8.0500</td>\n",
" <td>NaN</td>\n",
" <td>S</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Survived Pclass \\\n",
"0 1 0 3 \n",
"1 2 1 1 \n",
"2 3 1 3 \n",
"3 4 1 1 \n",
"4 5 0 3 \n",
"\n",
" Name Sex Age SibSp \\\n",
"0 Braund, Mr. Owen Harris male 22.0 1 \n",
"1 Cumings, Mrs. John Bradley (Florence Briggs Th... female 38.0 1 \n",
"2 Heikkinen, Miss. Laina female 26.0 0 \n",
"3 Futrelle, Mrs. Jacques Heath (Lily May Peel) female 35.0 1 \n",
"4 Allen, Mr. William Henry male 35.0 0 \n",
"\n",
" Parch Ticket Fare Cabin Embarked \n",
"0 0 A/5 21171 7.2500 NaN S \n",
"1 0 PC 17599 71.2833 C85 C \n",
"2 0 STON/O2. 3101282 7.9250 NaN S \n",
"3 0 113803 53.1000 C123 S \n",
"4 0 373450 8.0500 NaN S "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
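"•PassengerId: passenger ID  \n",
"•Survived: whether the passenger survived  \n",
"•Pclass: cabin class (fairly important)  \n",
"•Name: passenger name (more information can be extracted from it)  \n",
"•Sex: sex (fairly important)  \n",
"•Age: age (fairly important)  \n",
"•Parch: number of parents/children aboard (direct relatives)  \n",
"•SibSp: number of siblings/spouses aboard (collateral relatives)  \n",
"•Ticket: ticket number  \n",
"•Fare: ticket fare  \n",
"•Cabin: cabin number  \n",
"•Embarked: code of the port of embarkation"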
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'pandas.core.frame.DataFrame'>\n",
"RangeIndex: 891 entries, 0 to 890\n",
"Data columns (total 12 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 PassengerId 891 non-null int64 \n",
" 1 Survived 891 non-null int64 \n",
" 2 Pclass 891 non-null int64 \n",
" 3 Name 891 non-null object \n",
" 4 Sex 891 non-null object \n",
" 5 Age 714 non-null float64\n",
" 6 SibSp 891 non-null int64 \n",
" 7 Parch 891 non-null int64 \n",
" 8 Ticket 891 non-null object \n",
" 9 Fare 891 non-null float64\n",
" 10 Cabin 204 non-null object \n",
" 11 Embarked 889 non-null object \n",
"dtypes: float64(2), int64(5), object(5)\n",
"memory usage: 83.7+ KB\n"
]
}
],
"source": [
"train.info()"
]
},
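{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `info()` summary above reports 714 non-null `Age` values, 204 for `Cabin` and 889 for `Embarked` out of 891 rows. A minimal sketch for counting the gaps directly, assuming `train` is the DataFrame loaded earlier (this cell is an illustrative addition, not part of the original analysis):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative addition: count missing values per column, largest first\n",
"missing = train.isnull().sum()\n",
"missing[missing > 0].sort_values(ascending=False)"
]
},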
{
"cell_type": "markdown",
"metadata": {},
"source": [
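"2. Preliminary data analysis using statistics and plots\n",
"Goal: get an initial picture of how the features relate to one another, in preparation for feature engineering and model building"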
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0 549\n",
"1 342\n",
"Name: Survived, dtype: int64"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"train['Survived'].value_counts()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1) Sex feature: women survived at a much higher rate than men"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Sex', ylabel='Survived'>"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAT1UlEQVR4nO3df5BdZ33f8ffHaxQPxjih2o49koxUEHFU4uB4LZLmFxQ7kWnHSgMksp0JnrpomCKTgYBrClWoHJpGNKQkFSkidaFMQDimwyytUpWAQxODQevY2JUcka1skARq1pgfAjo2G3/7x71yLldXu1dIZ692z/s1s6N7nvPsuV9JV/vRec45z5OqQpLUXueMugBJ0mgZBJLUcgaBJLWcQSBJLWcQSFLLnTvqAk7V8uXLa/Xq1aMuQ5IWlXvvvffRqhoftG/RBcHq1auZmpoadRmStKgk+cLJ9jk0JEktZxBIUss1GgRJNiQ5kGQ6ya0D9l+S5K4k9yV5IMlLm6xHknSixoIgyRiwA7gGWAdcl2RdX7e3AHdU1eXAJuBdTdUjSRqsyTOC9cB0VR2sqieAXcDGvj4FPLP7+kLgSw3WI0kaoMkgWAEc6tk+3G3r9Vbgl5McBnYDNw86UJLNSaaSTM3MzDRRqyS11qgvFl8HvLeqVgIvBd6f5ISaqmpnVU1U1cT4+MDbYCVJ36Mmg+AIsKpne2W3rddNwB0AVfVp4DxgeYM1SZL6NPlA2V5gbZI1dAJgE3B9X58vAi8B3pvkh+gEgWM/UsvdcsstHD16lIsuuojt27ePupwlr7EgqKrZJFuAPcAYcHtV7UuyDZiqqkng14D3JHkdnQvHN5Yr5Uitd/ToUY4c6R9AUFManWKiqnbTuQjc27a15/V+4CearEGSNLdRXyyWJI2YQSBJLWcQSFLLGQSS1HIGgSS1nEEgSS1nEEhSyxkEktRyBoEktdyiW7xeWsq+uO2HR13CWWH2sWcB5zL72Bf8MwEu2fpgo8f3jECSWs4gkKSWMwgkqeUMAklqOYNAklrOIJCkljMIJKnlGg2CJBuSHEgyneTWAft/J8n93a/PJ/lak/VIkk7U2ANlScaAHcDVwGFgb5LJ7vKUAFTV63r63wxc3lQ9kqTBmjwjWA9MV9XBqnoC2AVsnKP/dcAHG6xHkjRAk0GwAjjUs32423aCJM8G1gCfaLAeSdIAZ8vF4k3AnVX1N4N2JtmcZCrJ1MzMzAKXJklLW5NBcARY1bO9sts2yCbmGBaqqp1VNVFVE+Pj42ewRElSk7OP7gXWJllDJwA2Adf3d0pyKfADwKcbrEXSIrL8vCeB2e6valpjQVBVs0m2AHuAMeD2qtqXZBswVVWT3a6bgF1VVU3VImlxecNlXxt1Ca3S6HoEVbUb2N3XtrVv+61N1iBJmtvZcrFYkjQiBoEktZxBIEktZxBIUssZBJLUcgaBJLWcQSBJLWcQSFLLGQSS1HIGgSS1nEEgSS1nEEhSyxkEktRyBoEktZxBIEktZxBIUssZBJLUcgaBJLVco0GQZEOSA0mmk9x6kj6/mGR/kn1JPtBkPZKkEzW2ZnGSMWAHcDVwGNibZLKq9vf0WQu8CfiJqvpqkr/bVD2SpMGaPCNYD0xX1cGqegLYBWzs6/MqYEdVfRWgqv66wXokSQM0GQQrgEM924e7bb2eBzwvyd1J7kmyYdCBkmxOMpVkamZmpqFyJamdRn2x+FxgLfAi4DrgPUm+v79TVe2sqomqmhgfH1/YCiVpiWsyCI4Aq3q2V3bbeh0GJqvqO1X1MPB5OsEgSVogTQbBXmBtkjVJlgGbgMm+Ph+hczZAkuV0hooONliTJKlPY0FQVbPAFmAP8BBwR1XtS7ItybXdbnuAryTZD9wFvLGqvtJUTZKkEzV2+yhAVe0Gdve1be15XcDru1+SpBEY9cViSdKIGQSS1HIGgSS1nEEgSS1nEEhSyxkEktRyBoEktZxBIEktZx
BIUssZBJLUcgaBJLWcQSBJLWcQSFLLGQSS1HIGgSS1nEEgSS1nEEhSyzUaBEk2JDmQZDrJrQP235hkJsn93a9/1mQ9kqQTNbZUZZIxYAdwNXAY2Jtksqr293X9UFVtaaoOSdLcmjwjWA9MV9XBqnoC2AVsbPD9JEnfgyaDYAVwqGf7cLet38uSPJDkziSrBh0oyeYkU0mmZmZmmqhVklpr1BeLPwqsrqrLgI8B7xvUqap2VtVEVU2Mj48vaIGStNQ1GQRHgN7/4a/stj2lqr5SVY93N/8AuKLBeiRJAzQZBHuBtUnWJFkGbAImezskubhn81rgoQbrkSQN0NhdQ1U1m2QLsAcYA26vqn1JtgFTVTUJvDbJtcAs8BhwY1P1SJIGmzMIkhwD6mT7q+qZc31/Ve0Gdve1be15/SbgTUNVKklqxJxBUFUXACS5Dfgy8H4gwA3AxXN8qyRpkRj2GsG1VfWuqjpWVd+oqt/HZwIkaUkYNgi+leSGJGNJzklyA/CtJguTJC2MYYPgeuAXgf/b/XpFt02StMgNdddQVT2CQ0GStCQNdUaQ5HlJPp7kf3e3L0vylmZLkyQthGGHht5D5zbP7wBU1QN0HhCTJC1ywwbB06vqs31ts2e6GEnSwhs2CB5N8hy6D5cleTmd5wokSYvcsFNMvAbYCVya5AjwMJ2HyiRJi9ywQfCFqroqyfnAOVV1rMmiJEkLZ9ihoYeT7AR+DPhmg/VIkhbYsEFwKfAndIaIHk7yH5L8ZHNlSZIWylBBUFXfrqo7quoXgMuBZwKfbLQySdKCGHphmiQ/k+RdwL3AeXSmnJAkLXJDXSxO8ghwH3AH8MaqcsI5SVoihr1r6LKq+kajlUiSRmK+FcpuqartwNuSnLBSWVW9dp7v3wC8k85SlX9QVf/2JP1eBtwJXFlVU8MWL0k6ffOdERxfTP6UfzgnGQN2AFcDh4G9SSaran9fvwuAXwU+c6rvIUk6ffMtVfnR7ssHq+ovTvHY64HpqjoIkGQXnams9/f1uw34LeCNp3h8SdIZMOxdQ7+d5KEktyV5/pDfswI41LN9uNv2lCQ/Cqyqqv8+14GSbE4ylWRqZmZmyLeXJA1j2OcIXgy8GJgB3p3kwdNdjyDJOcA7gF8b4v13VtVEVU2Mj4+fzttKkvoM/RxBVR2tqt8FXg3cD2yd51uOAKt6tld22467AHg+8Kfd21N/DJhMMjFsTZKk0zfsCmU/lOStSR4Efg/4FJ0f7HPZC6xNsibJMjoL2Uwe31lVX6+q5VW1uqpWA/cA13rXkCQtrGGfI7gd2AX8XFV9aZhvqKrZJFuAPXRuH729qvYl2QZMVdXk3EeQJC2EeYOgexvow1X1zlM9eFXtBnb3tQ0cUqqqF53q8SVJp2/eoaGq+htgVXd4R5K0xAw7NPQwcHeSSeCpeYaq6h2NVCVJWjDDBsH/6X6dQ+duH0nSEjFUEFTVv266EEnSaAw7DfVdwKBJ5/7hGa9IkrSghh0aekPP6/OAlwGzZ74cSdJCG3Zo6N6+pruTfLaBeiRJC2zYoaFn9WyeA0wAFzZSkSRpQQ07NHQvf3uNYBZ4BLipiYIkSQtrvhXKrgQOVdWa7vYr6VwfeIQT1xWQJC1C8z1Z/G7gCYAkPw38JvA+4OvAzmZLkyQthPmGhsaq6rHu618CdlbVh4EPJ7m/0cokSQtivjOCsSTHw+IlwCd69g17fUGSdBab74f5B4FPJnkU+H/AnwEkeS6d4SFJ0iI33+L1b0vyceBi4H9W1fE7h84Bbm66OElS8+Yd3qmqewa0fb6ZciRJC23oNYslSUuTQSBJLddoECTZkORAkukktw7Y/+okDya5P8mfJ1nXZD2SpBM1FgTdtY53ANcA64DrBvyg/0BV/XBVvQDYDrjimSQtsCbPCNYD01V1sKqeAHYBG3s7VNU3ejbPZ8CaB5KkZjX5UNgK4FDP9mHghf2dkrwGeD2wDBi40E2SzcBmgEsuueSMFy
pJbTbyi8VVtaOqngP8C+AtJ+mzs6omqmpifHx8YQuUpCWuySA4Aqzq2V7ZbTuZXcDPN1iPJGmAJoNgL7A2yZoky4BNwGRvhyRrezb/EfBXDdYjSRqgsWsEVTWbZAuwBxgDbq+qfUm2AVNVNQlsSXIV8B3gq8Arm6pHkjRYozOIVtVuYHdf29ae17/a5PtLkuY38ovFkqTRMggkqeUMAklqOYNAklrOIJCkljMIJKnlDAJJajmDQJJaziCQpJYzCCSp5QwCSWo5g0CSWs4gkKSWMwgkqeUanYZaZ7dbbrmFo0ePctFFF7F9+/ZRlyNpRAyCFjt69ChHjsy1eqikNnBoSJJartEgSLIhyYEk00luHbD/9Un2J3kgyceTPLvJeiRJJ2osCJKMATuAa4B1wHVJ1vV1uw+YqKrLgDsBB6olaYE1eUawHpiuqoNV9QSwC9jY26Gq7qqqb3c37wFWNliPJGmAJoNgBXCoZ/twt+1kbgL+eNCOJJuTTCWZmpmZOYMlSpLOiovFSX4ZmADePmh/Ve2sqomqmhgfH1/Y4iRpiWvy9tEjwKqe7ZXdtu+S5CrgzcDPVNXjDdYjSRqgySDYC6xNsoZOAGwCru/tkORy4N3Ahqr66wZr+S5XvPG/LNRbndUuePQYY8AXHz3mnwlw79t/ZdQlSCPR2NBQVc0CW4A9wEPAHVW1L8m2JNd2u70deAbwR0nuTzLZVD2SpMEafbK4qnYDu/vatva8vqrJ95ckze+suFgsSRodg0CSWs4gkKSWMwgkqeUMAklqOYNAklrOhWla7Mll53/Xr5LaySBosW+t/dlRlyDpLODQkCS1nEEgSS1nEEhSyxkEktRyBoEktZxBIEktZxBIUssZBJLUcgaBJLVco0GQZEOSA0mmk9w6YP9PJ/mLJLNJXt5kLZKkwRoLgiRjwA7gGmAdcF2SdX3dvgjcCHygqTokSXNrcq6h9cB0VR0ESLIL2AjsP96hqh7p7nuywTokSXNocmhoBXCoZ/twt+2UJdmcZCrJ1MzMzBkpTpLUsSguFlfVzqqaqKqJ8fHxUZcjSUtKk0FwBFjVs72y2yZJOos0GQR7gbVJ1iRZBmwCJht8P0nS96CxIKiqWWALsAd4CLijqvYl2ZbkWoAkVyY5DLwCeHeSfU3VI0karNEVyqpqN7C7r21rz+u9dIaMJEkjsiguFkuSmmMQSFLLGQSS1HIGgSS1nEEgSS1nEEhSyxkEktRyBoEktZxBIEktZxBIUssZBJLUcgaBJLWcQSBJLWcQSFLLGQSS1HIGgSS1nEEgSS1nEEhSyzUaBEk2JDmQZDrJrQP2f1+SD3X3fybJ6ibrkSSdqLEgSDIG7ACuAdYB1yVZ19ftJuCrVfVc4HeA32qqHknSYE2eEawHpqvqYFU9AewCNvb12Qi8r/v6TuAlSdJgTZKkPuc2eOwVwKGe7cPAC0/Wp6pmk3wd+DvAo72dkmwGNnc3v5nkQCMVt9Ny+v682yr/7pWjLkHfzc/mcb9+Rv5//OyT7WgyCM6YqtoJ7Bx1HUtRkqmqmhh1HVI/P5sLp8mhoSPAqp7tld22gX2SnAtcCHylwZokSX2aDIK9wNoka5IsAzYBk319JoHj5+MvBz5RVdVgTZKkPo0NDXXH/LcAe4Ax4Paq2pdkGzBVVZPAfwLen2QaeIxOWGhhOeSms5WfzQUS/wMuSe3mk8WS1HIGgSS1nEGgpyR5UZL/Nuo6tDQkeW2Sh5L8YUPHf2uSNzRx7LZZFM8RSFqU/jlwVVUdHnUhmptnBEtMktVJ/jLJe5N8PskfJrkqyd1J/irJ+u7Xp5Pcl+RTSX5wwHHOT3J7ks92+/VPDyKdVJL/CPw94I+TvHnQZynJjUk+kuRjSR5JsiXJ67t97knyrG6/VyXZm+RzST6c5OkD3u85Sf5HknuT/FmSSxf2d7y4GQRL03OB3wYu7X5dD/wk8AbgXwJ/CfxUVV0ObAX+zYBjvJnOcx3rgRcDb09y/g
LUriWgql4NfInOZ+d8Tv5Zej7wC8CVwNuAb3c/l58GfqXb579W1ZVV9SPAQ3Qmq+y3E7i5qq6g8zl/VzO/s6XJoaGl6eGqehAgyT7g41VVSR4EVtN5gvt9SdYCBTxtwDF+Fri2Zwz2POASOv8QpVNxss8SwF1VdQw41p1r7KPd9geBy7qvn5/kN4DvB55B59mkpyR5BvAPgD/qmbPy+xr4fSxZBsHS9HjP6yd7tp+k83d+G51/gP+kuwbEnw44RoCXVZUT/Ol0DfwsJXkh839WAd4L/HxVfS7JjcCL+o5/DvC1qnrBGa26RRwaaqcL+dt5n248SZ89wM3HpwVPcvkC1KWl6XQ/SxcAX07yNOCG/p1V9Q3g4SSv6B4/SX7kNGtuFYOgnbYDv5nkPk5+VngbnSGjB7rDS7ctVHFack73s/SvgM8Ad9O5vjXIDcBNST4H7OPEtU80B6eYkKSW84xAklrOIJCkljMIJKnlDAJJajmDQJJaziCQTkF33px9SR5Icn/3oShpUfPJYmlISX4c+MfAj1bV40mWA8tGXJZ02jwjkIZ3MfBoVT0OUFWPVtWXklyR5JPdmS/3JLk4yYVJDhyf2TXJB5O8aqTVSyfhA2XSkLqTm/058HTgT4APAZ8CPglsrKqZJL8E/FxV/dMkVwPbgHcCN1bVhhGVLs3JoSFpSFX1zSRXAD9FZzrlDwG/QWcq5Y91p9IZA77c7f+x7vw3OwDnvtFZyzMC6XuU5OXAa4DzqurHB+w/h87ZwmrgpcenBpfONl4jkIaU5Ae7azgc9wI66zOMdy8kk+RpSf5+d//ruvuvB/5zd/ZM6azjGYE0pO6w0O/RWSBlFpgGNgMrgd+lM733ucC/B/4X8BFgfVUdS/IO4FhV/fqCFy7NwyCQpJZzaEiSWs4gkKSWMwgkqeUMAklqOYNAklrOIJCkljMIJKnl/j/Rj08QtS+qvAAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.barplot(x=\"Sex\", y=\"Survived\", data=train)"
]
},
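{
"cell_type": "markdown",
"metadata": {},
"source": [
"The bar plot above can be backed by numbers. A minimal sketch, assuming `train` is the DataFrame loaded earlier; because `Survived` is coded 0/1, the mean within each group is that group's survival rate (illustrative addition, not part of the original analysis):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative addition: survival rate by sex (mean of the 0/1 Survived column)\n",
"train.groupby('Sex')['Survived'].mean()"
]
},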
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2) Pclass feature: the higher the passenger's class, the higher the survival rate"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Pclass', ylabel='Survived'>"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAASxklEQVR4nO3dcZBdZ33e8e9jOarBOKEgtfJYKlZAlDrUE8pG6dQdIAS3opmxMgVSOW4Sz1BUZiKgzYAwbeOCKO1EJGQaqiQojSeECQgDbbNp1agUO0BcbLQCYyM5oooMSCob1jYGm9DIsn/9Y4/oZXW1e2Xv2avV+/3M3NE973nvub87d0bPnve95z2pKiRJ7bpo3AVIksbLIJCkxhkEktQ4g0CSGmcQSFLjLh53Aedq1apVdeWVV467DElaVg4cOPBAVa0etm/ZBcGVV17J1NTUuMuQpGUlyVfOts+hIUlqnEEgSY3rNQiSbEpyOMmRJDcN2f+rSe7uHl9K8nCf9UiSztTbHEGSFcAu4FrgOLA/yWRVHTrdp6r++UD/NwAv6qseSdJwfZ4RbASOVNXRqjoJ7AE2z9P/euBDPdYjSRqizyC4Ajg2sH28aztDkucA64HbeqxHkjTE+TJZvAX4aFU9Pmxnkq1JppJMzczMLHFpknRh6zMITgDrBrbXdm3DbGGeYaGq2l1VE1U1sXr10OshJElPUp8XlO0HNiRZz2wAbAF+em6nJC8A/jLwmR5rWRa2b9/O9PQ0a9asYefOneMuR1IjeguCqjqVZBuwD1gB3FJVB5PsAKaqarLrugXYU94hh+npaU6cONtJkyT1o9clJqpqL7B3TtvNc7bf3mcNkqT5nS+TxZKkMTEIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJalyvN6YZtxe/5XfHXcI5ueyBR1gBfPWBR5ZV7Qfe/bPjLkHSU+AZgSQ1ziCQpMYZBJLUOINAkhpnEEhS43oNgiSbkhxOciTJTWfp81NJDiU5mOSDfdYjSTpTbz8fTbIC2AVcCxwH9ieZrKpDA302AG8DrqmqbyT5K33VI0kars8zgo3Akao6WlUngT3A5jl9XgfsqqpvAFTV13usR5I0RJ9BcAVwbGD7eNc26PnA85PckeTOJJuGHSjJ1iRTSaZmZmZ6KleS2jTuyeKLgQ3Ay4Drgd9K8sy5napqd1VNVNXE6tWrl7ZCSbrA9RkEJ4B1A9tru7ZBx4HJqnqsqu4HvsRsMEiSlkifQbAf2JBkfZKVwBZgck6f/8Ls2QBJVjE7VHS0x5okSXP0FgRVdQrYBuwD7gNuraqDSXYkua7rtg94MMkh4HbgLVX1YF81SZLO1Ovqo1W1F9g7p+3mgecF/EL3kCSNwbgniyVJY2YQSFLjDAJJapxBIEmNMwgkqXEX9D2Ll5snVl76Pf9K0lIwCM4j397w98ZdgqQGOTQkSY0zCCSpcQaBJDXOOQJpEWzfvp3p6WnWrFnDzp07x12OdE4MAmkRTE9Pc+LE3FXWpeXBoSFJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWpcr0GQZFOSw0mOJLlpyP4bk8wkubt7/JM+65Eknam3JSaSrAB2AdcCx4H9SSar6tCcrh+uqm191SFJml+fZwQbgSNVdbSqTgJ7gM09vp8k6UnoMwiuAI4NbB/v2uZ6VZJ7knw0ybphB0qyNclUkqmZmZk+apWkZo17svgPgCur6mrg48D7h3Wqqt1VNVFVE6tXr17SAiXpQtdnEJwABv/CX9u1fVdVPVhVf9Ft/kfgxT3WI0kaos8g2A9sSLI+yUpgCzA52CHJ5QOb1wH39ViPJGmI3n41VFWnkmwD9gErgFuq6mCSHcBUVU0Cb0xyHXAKeAi4sa96JEnD9XqHsqraC+yd03bzwPO3AW
/rswZJ0vzGPVksSRozg0CSGufN63Xe+uqOvznuEkZ26qFnARdz6qGvLKu6/9rN9467BJ0HPCOQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY2bdxnqJI8Adbb9VfX9i16RJGlJzRsEVXUZQJJ3Al8DPgAEuAG4fJ6XSpKWiVGHhq6rql+vqkeq6ltV9RvA5j4LkyQtjVGD4NtJbkiyIslFSW4Avt1nYZKkpTFqEPw08FPAn3WP13Rt80qyKcnhJEeS3DRPv1clqSQTI9YjSVokI92zuKq+zDkOBSVZAewCrgWOA/uTTFbVoTn9LgPeBNx1LseXJC2Okc4Ikjw/ySeSfLHbvjrJv1rgZRuBI1V1tKpOAnsYHibvBH4J+L/nULckaZGMOjT0W8DbgMcAquoeYMsCr7kCODawfbxr+64kfwtYV1X/bb4DJdmaZCrJ1MzMzIglS0tn1SVP8FefdopVlzwx7lKkczbS0BDw9Kr6bJLBtlNP5Y2TXAS8B7hxob5VtRvYDTAxMXHW6xqkcXnz1Q+PuwTpSRv1jOCBJM+lu7gsyauZva5gPieAdQPba7u20y4DXgj8UZIvA38bmHTCWJKW1qhnBD/P7F/kL0hyArif2YvK5rMf2JBkPbMBsIWBXxpV1TeBVae3k/wR8Oaqmhq5eknSUzZqEHylql6R5FLgoqp6ZKEXVNWpJNuAfcAK4JaqOphkBzBVVZNPvmxJ0mIZNQjuT/KHwIeB20Y9eFXtBfbOabv5LH1fNupxJUmLZ9Q5ghcA/5PZIaL7k/yHJH+3v7IkSUtlpCCoqj+vqlur6h8CLwK+H/hkr5VJkpbEyPcjSPLSJL8OHAAuYXbJCUnSMjfSHEH3887PA7cCb6kqF5yTpAvEqJPFV1fVt3qtRJI0FgvdoWx7Ve0E3pXkjCt6q+qNvVUmSVoSC50R3Nf960VeknSBWuhWlX/QPb23qj63BPVIkpbYqL8a+pUk9yV5Z5IX9lqRJGlJjXodwY8BPwbMAO9Lcu8I9yOQJC0DI19HUFXTVfVrwOuBu4GhS0VIkpaXUe9Q9jeSvD3JvcB7gf/F7LLSkqRlbtTrCG5h9laTf7+q/k+P9UiSltiCQdDdhP7+qvr3S1CPJGmJLTg0VFWPA+uSrFyCeiRJS2zk+xEAdySZBL67zlBVvaeXqiRJS2bUIPjT7nERs/caliRdIEYKgqp6R9+FSJLGY9RlqG8Hhi069/JFr0iStKRGHRp688DzS4BXAacWvxxJ0lIbdWjowJymO5J8tod6JElLbNQri5818FiVZBPwAyO8blOSw0mOJLlpyP7Xd+sW3Z3kj5Nc9SQ+gyTpKRh1aOgA/3+O4BTwZeC1872guxBtF3AtcBzYn2Syqg4NdPtgVf1m1/864D3AppGrlyQ9ZfOeEST5kSRrqmp9Vf0g8A7gT7rHofleC2wEjlTV0ao6yewSFZsHO8y5/eWlDJmQliT1a6GhofcBJwGSvAT4d8D7gW8Cuxd47RXAsYHt413b90jy80n+FNgJDL31ZZKtSaaSTM3MzCzwtpKkc7FQEKyoqoe65/8I2F1VH6uqXwSetxgFVNWuqnou8FZg6D0Oqmp3VU1U1cTq1asX420lSZ0FgyDJ6XmEHwduG9i30PzCCWDdwPbaru1s9gA/ucAxJUmLbKEg+BDwySS/D3wH+DRAkucxOzw0n/3AhiTruwXrtgCTgx2SbBjY/Angf59D7ZKkRbDQzevfleQTwOXA/6iq05O5FwFvWOC1p5JsA/YBK4Bbqupgkh3AVFVNAtuSvAJ4DPgG8HNP7eNIks7Vgj8frao7h7R9aZSDV9VeYO+ctpsHnr9plONIUp+2b9/O9PQ0a9asYefOneMuZ8mNeh2BJF2wpqenOXFivinMC9vIN6+XJF2YDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEE
hS4wwCSWqcS0xIWnTXvPeacZdwTlY+vJKLuIhjDx9bVrXf8YY7FuU4nhFIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjeg2CJJuSHE5yJMlNQ/b/QpJDSe5J8okkz+mzHknSmXoLgiQrgF3AK4GrgOuTXDWn2+eBiaq6GvgosLOveiRJw/V5RrAROFJVR6vqJLAH2DzYoapur6o/7zbvBNb2WI8kaYg+g+AK4NjA9vGu7WxeC/z3YTuSbE0ylWRqZmZmEUuUJKinF09c+gT19Bp3KWNxXqw+muQfAxPAS4ftr6rdwG6AiYmJNr8pSb157JrHxl3CWPUZBCeAdQPba7u275HkFcC/BF5aVX/RYz2SpCH6HBraD2xIsj7JSmALMDnYIcmLgPcB11XV13usRZJ0Fr0FQVWdArYB+4D7gFur6mCSHUmu67q9G3gG8JEkdyeZPMvhJEk96XWOoKr2AnvntN088PwVfb6/JGlhXlksSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1LhegyDJpiSHkxxJctOQ/S9J8rkkp5K8us9aJEnD9RYESVYAu4BXAlcB1ye5ak63rwI3Ah/sqw5J0vwu7vHYG4EjVXUUIMkeYDNw6HSHqvpyt++JHuuQJM2jz6GhK4BjA9vHu7ZzlmRrkqkkUzMzM4tSnCRp1rKYLK6q3VU1UVUTq1evHnc5knRB6TMITgDrBrbXdm2SpPNIn0GwH9iQZH2SlcAWYLLH95MkPQm9BUFVnQK2AfuA+4Bbq+pgkh1JrgNI8iNJjgOvAd6X5GBf9UiShuvzV0NU1V5g75y2mwee72d2yEiSNCbLYrJYktQfg0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY3rNQiSbEpyOMmRJDcN2f+Xkny4239Xkiv7rEeSdKbegiDJCmAX8ErgKuD6JFfN6fZa4BtV9TzgV4Ff6qseSdJwfZ4RbASOVNXRqjoJ7AE2z+mzGXh/9/yjwI8nSY81SZLmuLjHY18BHBvYPg786Nn6VNWpJN8Eng08MNgpyVZga7f5aJLDvVR8fljFnM9/vssv/9y4SzhfLLvvjn/t310Dlt33lzee0/f3nLPt6DMIFk1V7QZ2j7uOpZBkqqomxl2Hzp3f3fLW8vfX59DQCWDdwPbarm1onyQXAz8APNhjTZKkOfoMgv3AhiTrk6wEtgCTc/pMAqfHFV4N3FZV1WNNkqQ5ehsa6sb8twH7gBXALVV1MMkOYKqqJoHfBj6Q5AjwELNh0bomhsAuUH53y1uz31/8A1yS2uaVxZLUOINAkhpnEJwnktyS5OtJvjjuWnRukqxLcnuSQ0kOJnnTuGvS6JJckuSzSb7QfX/vGHdNS805gvNEkpcAjwK/W1UvHHc9Gl2Sy4HLq+pzSS4DDgA/WVWHxlyaRtCtZnBpVT2a5PuAPwbeVFV3jrm0JeMZwXmiqj7F7C+ntMxU1deq6nPd80eA+5i9al7LQM16tNv8vu7R1F/IBoG0iLoVdF8E3DXmUnQOkqxIcjfwdeDjVdXU92cQSIskyTOAjwH/rKq+Ne56NLqqeryqfpjZFRA2JmlqeNYgkBZBN7b8MeD3quo/jbsePTlV9TBwO7BpzKUsKYNAeoq6ycbfBu6rqveMux6dmySrkzyze/404FrgT8Za1BIzCM4TST4EfAb460mOJ3ntuGvSyK4BfgZ4eZK7u8c/GHdRGtnlwO1J7mF2jbSPV9V/HXNNS8qfj0pS4zwjkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgzZHk8e4noF9M8pEkT5
+n79uTvHkp65MWm0Egnek7VfXD3SqwJ4HXj7sgqU8GgTS/TwPPA0jys0nu6dat/8Dcjklel2R/t/9jp88kkrymO7v4QpJPdW0/1K2Bf3d3zA1L+qmkAV5QJs2R5NGqekaSi5ldP+gPgU8B/xn4O1X1QJJnVdVDSd4OPFpVv5zk2VX1YHeMfwP8WVW9N8m9wKaqOpHkmVX1cJL3AndW1e8lWQmsqKrvjOUDq3meEUhnelq3JPEU8FVm1xF6OfCRqnoAoKqG3TvihUk+3f3HfwPwQ137HcDvJHkdsKJr+wzwL5K8FXiOIaBxunjcBUjnoe90SxJ/1+y6cgv6HWbvTPaFJDcCLwOoqtcn+VHgJ4ADSV5cVR9MclfXtjfJP62q2xbvI0ij84xAGs1twGuSPBsgybOG9LkM+Fq3JPUNpxuTPLeq7qqqm4EZYF2SHwSOVtWvAb8PXN37J5DOwjMCaQRVdTDJu4BPJnkc+Dxw45xuv8jsnclmun8v69rf3U0GB/gE8AXgrcDPJHkMmAb+be8fQjoLJ4slqXEODUlS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1Lj/B4zNFbon9FfyAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.barplot(x=\"Pclass\", y=\"Survived\", data=train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3) SibSp feature: passengers with a moderate number of siblings/spouses aboard survived at a higher rate"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='SibSp', ylabel='Survived'>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAATnElEQVR4nO3dfZBdd33f8ffHclTHxkmaeFM5lhx7GkGqEoeHjXFqBvNgp6Kk0kxxiG0eZyBqZlCg5UFjTxiXmMl0IhJISxSKQjylpCBck7SiUWso2Ka4AbQGYyOpAmEbS4KtJYyJMdS27G//uEfuZX2lvSvtuVdX5/2aubPn4XfOftcj72fP75zz+6WqkCR11ynjLkCSNF4GgSR1nEEgSR1nEEhSxxkEktRxp467gIU666yz6rzzzht3GZI0UW6//faDVTU1aN/EBcF5553HzMzMuMuQpImS5JtH2mfXkCR1nEEgSR1nEEhSxxkEktRxBoEkdZxBIEkdZxBIUscZBJLUcRP3QplOXhs2bGB2dpZly5axcePGcZcjdYZBoBPG7Ows+/fvH3cZUufYNSRJHWcQSFLHGQSS1HEGgSR1nEEgSR1nEEhSxxkEktRxBoEkdZxBIEkd12oQJFmdZHeSPUmuPkKbVyTZmWRHko+0Wc/JbsOGDbzmNa9hw4YN4y5F0gRpbYiJJEuATcBlwD5ge5KtVbWzr81K4Brg4qr6bpKfbaueLnCIBknHos0rgguBPVV1d1U9CmwB1s5p81vApqr6LkBV3d9iPZKkAdoMgnOAvX3r+5pt/Z4OPD3JbUk+n2R1i/VIkgYY9+ijpwIrgRcCy4HPJvmlqnqwv1GSdcA6gHPPPXfEJUrSya3NK4L9wIq+9eXNtn77gK1V9VhV3QN8jV4w/Iiq2lxV01U1PTU11VrBktRFbQbBdmBlkvOTLAWuALbOafOf6V0NkOQsel1Fd7dYkyRpjtaCoKoOAeuBm4BdwA1VtSPJdUnWNM1uAr6TZCdwM/D2qvpOWzVJkp6q1XsEVbUN2DZn27V9ywW8pflIksbAN4slqeMMAknqOINAkjrOIJCkjjMIJKnjDAJJ6jiDQJI6ziCQpI4zCCSp4wwCSeo4g0CSOs4gkKSOMwgkqeMMAknqOINAkjpu3HMWC7jvul9alPMceuCngVM59MA3F+Wc51571/EXJemE5xWBJHWcQSBJHWcQSFLHGQSS1HEGgSR1nEEgSR1nEEhSxxkEktRxrQZBktVJdifZk+TqAftfl+RAkjuazxvarEeS9FStvVmcZAmwCbgM2AdsT7K1qnbOafqxqlrfVh2SpKNrc4iJC4E9VXU3QJItwFpgbhBIE2/Dhg3Mzs6ybNkyNm7cOO5ypAVps2voHGBv3/q+ZttcL09yZ5Ibk6wYdKIk65LMJJk5cOBAG7VKx2V2dpb9+/czOzs77lKkBRv3zeJPAOdV1QXAp4APDWpUVZurarqqpqempkZaoCSd7NoMgv1A/1/4y5ttT6qq71TVI83qB4HntliPJGmANoNgO7AyyflJlgJXAFv7GyQ5u291DbCrxXokSQO0drO4qg4lWQ/cBCwBrq+qHUmuA2aqaivwpiRrgEPAA8Dr2qpHkjRYqxPTVNU2YNucbdf2LV8DXNNmDZKkoxv3zWJJ0pgZBJLUcQaBJHWcQSBJHWcQSFLHtfrUkEbrrNOeAA41X0fn4vddvCjnWfrgUk7hFPY+uHdRznnb79y2CFVJJz+D4CTytgseHHcJkiaQXUOS1HEGgSR1nEEgSR1nEEhSxxkEktRxBoEkdZxBIEkdZxBIUscZBJLUcQaBJHWcQSBJHWcQSFLHGQSS1HEGgSR1nEEgSR1nEEhSx7UaBElWJ9mdZE+Sq4/S7uVJKsl0m/VIkp6qtSBIsgTYBLwUWAVcmWTVgHZnAm8GvtBWLZKkI2vziuBCYE9V3V1VjwJbgLUD2r0L+APg/7ZYiyTpCNoMgnOAvX3r+5
ptT0ryHGBFVf310U6UZF2SmSQzBw4cWPxKJanDxnazOMkpwHuAt87Xtqo2V9V0VU1PTU21X5wkdcipR9uZ5CGgjrS/qn7iKIfvB1b0rS9vth12JvBM4JYkAMuArUnWVNXMPHVLkhbJUYOgqs4ESPIu4NvAh4EArwTOnufc24GVSc6nFwBXAFf1nft7wFmH15PcArxtnCGwYcMGZmdnWbZsGRs3bhxXGZI0UkcNgj5rquqX+9bfn+QrwLVHOqCqDiVZD9wELAGur6odSa4DZqpq6zFX3ZLZ2Vn2798/f0NJOokMGwQPJ3klvSd/CrgSeHi+g6pqG7BtzraB4VFVLxyyFknSIhr2ZvFVwCuA/9N8foO+bh5J0uQa6oqgqu5l8DsAkqQJN9QVQZKnJ/l0kq826xckeUe7pUmSRmHYrqE/A64BHgOoqjvpPQUkSZpwwwbB6VX1xTnbDi12MZKk0Rs2CA4m+fs0L5cluZzeewWSpAk37OOjbwQ2A7+YZD9wD72XyiRJE27YIPhmVV2a5AzglKp6qM2iJEmjM2zX0D1JNgMXAd9vsR5J0ogNGwS/CPwPel1E9yT5kyTPb68sSdKoDBUEVfWDqrqhqv4Z8GzgJ4BbW61MkjQSQ89HkOSSJH8K3A6cRm/ICUnShBvqZnGSe4EvAzcAb6+qeQecG6Xnvv0/LMp5zjz4EEuA+w4+tCjnvP3drzn+oiSpZcM+NXRBVf1tq5VIksZivhnKNlTVRuD3kzxlprKqelNrlUkaGSdl6rb5rgh2NV+dOlI6iTkpU7fNN1XlJ5rFu6rqSyOoR5I0YsM+NfRHSXYleVeSZ7ZakSRppIZ9j+BFwIuAA8AHktzlfASSdHIY+j2Cqpqtqn8L/DZwB0eZuF6SNDmGnaHsHyR5Z5K7gPcB/wtY3mplkqSRGPY9guuBLcA/rqpvtViPJGnE5g2CJEuAe6rq34ygHknSiM3bNVRVjwMrkixd6MmTrE6yO8meJFcP2P/bzY3nO5J8LsmqhX4PSdLxGbZr6B7gtiRbgSfHGaqq9xzpgOZKYhNwGbAP2J5ka1Xt7Gv2kar6d037NcB7gNUL+xEkScdj2CD4RvM5BThzyGMuBPZU1d0ASbYAa4Eng2DO+EVn0MyJPC5PLD3jR75KUhcMFQRV9XvHcO5zgL196/uA581tlOSNwFuApcCLB50oyTpgHcC55557DKUM5+GVv9bauSXpRDXsMNQ3M+Cv9aoa+It7IapqE7ApyVXAO4DXDmizGdgMMD09PdarBkk62QzbNfS2vuXTgJcDh+Y5Zj+wom99ebPtSLYA7x+yHknSIhm2a+j2OZtuS/LFeQ7bDqxMcj69ALgCuKq/QZKVVfX1ZvVlwNeRJI3UsF1DP923egowDfzk0Y6pqkNJ1gM3AUuA66tqR5LrgJmq2gqsT3Ip8BjwXQZ0C0mS2jVs19Dt/P97BIeAe4HXz3dQVW0Dts3Zdm3f8puH/P6SpJbMN0PZrwB7q+r8Zv219O4P3EvfY6DSYqjTiyd4gjrd5wGkUZrvzeIPAI8CJHkB8K+BDwHfo3mKR1osj138GI9e9iiPXfzYuEuROmW+rqElVfVAs/ybwOaq+jjw8SR3tFqZJGkk5rsiWJLkcFi8BPhM375h7y9Ikk5g8/0y/yhwa5KDwA+B/wmQ5BfodQ9JkibcfJPX/36STwNnA5+sqsN38U4Bfqft4iRJ7Zu3e6eqPj9g29faKUeSNGpDz1ksSTo5GQSS1HEGgSR1nI+AqtNufcEli3KeH566BBJ+uG/fop3zks/euijnkebjFYEkdZxBIEkdZxBIUscZBJLUcQaBJHWcQSBJHWcQSFLHGQSS1HEGgSR1nEEgSR1nEEhSxxkEktRxrQZBktVJdifZk+TqAfvfkmRnkjuTfDrJz7dZjyTpqVoLgiRLgE3AS4FVwJVJVs1p9mVguqouAG4ENrZVjyRpsDavCC4E9lTV3VX1KLAFWNvfoKpurqofNKufB5
a3WI8kaYA2g+AcYG/f+r5m25G8Hvhvg3YkWZdkJsnMgQMHFrFESdIJcbM4yauAaeDdg/ZX1eaqmq6q6ampqdEWJ0knuTZnKNsPrOhbX95s+xFJLgV+F7ikqh5psR5J0gBtXhFsB1YmOT/JUuAKYGt/gyTPBj4ArKmq+1usRZJ0BK0FQVUdAtYDNwG7gBuqakeS65KsaZq9G3ga8J+S3JFk6xFOJ0lqSauT11fVNmDbnG3X9i1f2ub3lyTN74S4WSxJGh+DQJI6ziCQpI4zCCSp4wwCSeq4Vp8aktSuP3nrJxblPA8efPjJr4txzvV/9E+P+xwaHa8IJKnjDAJJ6jiDQJI6ziCQpI4zCCSp4wwCSeo4g0CSOs4gkKSOMwgkqeMMAknqOINAkjrOIJCkjjMIJKnjDAJJ6jiDQJI6ziCQpI4zCCSp41oNgiSrk+xOsifJ1QP2vyDJl5IcSnJ5m7VIkgZrLQiSLAE2AS8FVgFXJlk1p9l9wOuAj7RVhyTp6Nqcs/hCYE9V3Q2QZAuwFth5uEFV3dvse6LFOiRJR9Fm19A5wN6+9X3NNknSCWQibhYnWZdkJsnMgQMHxl2OJJ1U2gyC/cCKvvXlzbYFq6rNVTVdVdNTU1OLUpwkqafNINgOrExyfpKlwBXA1ha/nyTpGLQWBFV1CFgP3ATsAm6oqh1JrkuyBiDJryTZB/wG8IEkO9qqR5I0WJtPDVFV24Btc7Zd27e8nV6XkSRpTCbiZrEkqT0GgSR1nEEgSR1nEEhSxxkEktRxBoEkdZxBIEkdZxBIUse1+kKZ1BU/VfUjX6VJYhBIi+BVjzulhiaXXUOS1HEGgSR1nEEgSR1nEEhSxxkEktRxBoEkdZxBIEkdZxBIUscZBJLUcQaBJHWcQSBJHWcQSFLHGQSS1HEGgSR1XKtBkGR1kt1J9iS5esD+v5PkY83+LyQ5r816JElP1VoQJFkCbAJeCqwCrkyyak6z1wPfrapfAN4L/EFb9UiSBmvziuBCYE9V3V1VjwJbgLVz2qwFPtQs3wi8JElarEmSNEeqpan1klwOrK6qNzTrrwaeV1Xr+9p8tWmzr1n/RtPm4JxzrQPWNavPAHa3UnTPWcDBeVuduKx/fCa5drD+cWu7/p+vqqlBOyZiqsqq2gxsHsX3SjJTVdOj+F5tsP7xmeTawfrHbZz1t9k1tB9Y0be+vNk2sE2SU4GfBL7TYk2SpDnaDILtwMok5ydZClwBbJ3TZivw2mb5cuAz1VZflSRpoNa6hqrqUJL1wE3AEuD6qtqR5Dpgpqq2An8OfDjJHuABemExbiPpgmqR9Y/PJNcO1j9uY6u/tZvFkqTJ4JvFktRxBoEkdZxB0JhvOIwTXZLrk9zfvJsxUZKsSHJzkp1JdiR587hrWogkpyX5YpKvNPX/3rhrOhZJliT5cpL/Ou5aFirJvUnuSnJHkplx17MQSf5l8+/mq0k+muS0UddgEDD0cBgnun8PrB53EcfoEPDWqloFXAS8ccL++z8CvLiqfhl4FrA6yUXjLemYvBnYNe4ijsOLqupZk/QuQZJzgDcB01X1THoP1oz8oRmDoGeY4TBOaFX1WXpPXk2cqvp2VX2pWX6I3i+jc8Zb1fCq5/vN6o81n4l6CiPJcuBlwAfHXUsHnQr8ePMu1enAt0ZdgEHQcw6wt299HxP0i+hk0oxA+2zgC2MuZUGabpU7gPuBT1XVRNUP/DGwAXhizHUcqwI+meT2ZkiaiVBV+4E/BO4Dvg18r6o+Oeo6DAKdMJI8Dfg48C+q6m/HXc9CVNXjVfUsem/QX5jkmWMuaWhJfh24v6puH3ctx+H5VfUcet27b0zygnEXNIwkf5de78P5wM8BZyR51ajrMAh6hhkOQy1K8mP0QuA/VtVfjrueY1VVDwI3M1n3ay4G1iS5l1636IuT/MV4S1qY5i9rqup+4K/odfdOgkuBe6rqQFU9Bvwl8I9GXYRB0DPMcBhqSTP0+J8Du6rqPeOuZ6
GSTCX5qWb5x4HLgP891qIWoKquqarlVXUevX/7n6mqkf9VeqySnJHkzMPLwK8Bk/L03H3ARUlOb/4/eAljuGFvENAbDgM4PBzGLuCGqtox3qoWJslHgb8BnpFkX5LXj7umBbgYeDW9v0TvaD7/ZNxFLcDZwM1J7qT3R8WnqmriHsGcYH8P+FySrwBfBP66qv77mGsaSnMv6UbgS8Bd9H4nj3yoCYeYkKSO84pAkjrOIJCkjjMIJKnjDAJJ6jiDQJI6ziCQjiDJ7zajQt7ZPNL6vCQfPDwgXpLvH+G4i5J8oTlmV5J3jrRwaYFam6pSmmRJfhX4deA5VfVIkrOApVX1hiEO/xDwiqr6SjOy7TParFU6Xl4RSIOdDRysqkcAqupgVX0ryS1JnhzmOMl7m6uGTyeZajb/LL0BxA6PQbSzafvOJB9O8jdJvp7kt0b8M0kDGQTSYJ8EViT5WpI/TXLJgDZnADNV9Q+BW4F/1Wx/L7A7yV8l+edzJhq5AHgx8KvAtUl+rsWfQRqKQSAN0Mwv8FxgHXAA+FiS181p9gTwsWb5L4DnN8deB0zTC5OrgP7hDv5LVf2wqg7SG5xuUgZH00nMewTSEVTV48AtwC1J7gJeO98hfcd+A3h/kj8DDiT5mbltjrAujZxXBNIASZ6RZGXfpmcB35zT7BTg8mb5KuBzzbEva0aSBFgJPA482KyvbeY4/hnghfQGqZPGyisCabCnAe9rhpc+BOyh1010Y1+bh+lNQvMOejOT/Waz/dXAe5P8oDn2lVX1eJMNd9LrEjoLeFdVjXxaQmkuRx+VRqR5n+D7VfWH465F6mfXkCR1nFcEktRxXhFIUscZBJLUcQaBJHWcQSBJHWcQSFLH/T+4iGKupE73VwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.barplot(x=\"SibSp\", y=\"Survived\", data=train)"
]
},
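{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some `SibSp` values may be rare, in which case their bars rest on only a few passengers. A sketch that lists the group size next to each survival rate, again assuming `train` is the DataFrame loaded earlier (illustrative addition, not part of the original analysis):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative addition: survival rate and passenger count for each SibSp value\n",
"train.groupby('SibSp')['Survived'].agg(['mean', 'count'])"
]
},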
{
"cell_type": "markdown",
"metadata": {},
"source": [
"4) Parch feature: passengers with a moderate number of parents/children aboard survived at a higher rate"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Parch', ylabel='Survived'>"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAScUlEQVR4nO3dfZBdd33f8fdHEo5jx9hlpCKPJUWeopC4hMRUNbTOgHlKbUrtTqGpnTimGRpNO5jSIbDjNBnHMWU6EVPawjgEBRwekqAYUzpqqsZpwQHiFrDEk7EcM4pt0CpsbWPsGOPGyP72j3vkXFar3bvSnnt19Xu/Znb2nnN/e+5nPR599vzOU6oKSVK7Vk06gCRpsiwCSWqcRSBJjbMIJKlxFoEkNW7NpAMs19q1a2vz5s2TjiFJU2Xv3r0PVtW6hd6buiLYvHkze/bsmXQMSZoqSb5+tPecGpKkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmN660IktyY5P4kXz3K+0nyriT7k3wlyQv6yiJJOro+9wg+AFy8yPuXAFu6r23Ae3rMIkk6it4uKKuqTyfZvMiQy4AP1eCBCJ9NclaSs6vqm31lkvoyMzPD3Nwc69evZ/v27ZOOIy3LJK8sPgc4MLQ82607ogiSbGOw18CmTZvGEk5ajrm5OQ4ePDjpGNIxmYqDxVW1o6q2VtXWdesWvFWGJOkYTbIIDgIbh5Y3dOskSWM0ySLYBVzVnT30IuARjw9I0vj1dowgyUeAi4C1SWaBXwOeAVBVvwXsBl4F7Ae+C/xCX1kkSUfX51lDVyzxfgFv6OvzJUmjmYqDxZKk/lgEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1rtciSHJxkruT7E9yzQLvb0pya5IvJvlKklf1mUeSdKTeiiDJauAG4BLgPOCKJOfNG/arwE1VdT5wOfCbfeWRJC2szz2CC4D9VXVPVT0B7AQumzemgGd2r88E/qLHPJKkBfRZBOcAB4aWZ7t1w64DrkwyC+wG3rjQhpJsS7InyZ4HHnigj6yS1KxJHyy+AvhAVW0AXgV8OMkRmapqR1Vtraqt69atG3tISTqZ9VkEB4GNQ8sbunXDXg/cBFBV/wc4FVjbYyZJ0jx9FsHtwJYk5yY5hcHB4F3zxnwDeDlAkh9jUATO/UjSGPVWBFV1CLgauAW4i8HZQXcmuT7Jpd2wXwJ+McmXgY8A/7yqqq9MkqQjrelz41W1m8FB4OF11w693gdc2GcGSdLiJn2wWJI0YRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxq2ZdACtnJmZGebm5li/fj3bt2+fdBxJU8IiOInMzc1x8ODBSceQNGWcGpKkxlkEktQ4i0CSGtdrESS5OMndSfYnueYoY34myb4kdyb5/T7z6MQ2MzPDVVddxczMzKSjSE1Z9GBxkkeBOtr7VfXMRX52NXAD8EpgFrg9ya6q2jc0Zgvwy8CFVfXtJH9zmfl1EvFgtzQZixZBVZ0BkORtwDeBDwMBfg44e4ltXwDsr6p7um3sBC4D9g2N+UXghqr6dvd59x/D7yBJOg6jnj56aVX9xNDye5J8Gbh2kZ85BzgwtDwLvHDemB8BSHIbsBq4rqr+aMRMkgR4Dc3xGrUIHkvyc8BOBlNFVwCPrdDnbwEuAjYAn07y41X18PCgJNuAbQCbNm1agY+VBj
714pesyHYeX7MaEh6fnV2xbb7k059ake20wGnF4zPqweKfBX4G+L/d1z/t1i3mILBxaHlDt27YLLCrqr5XVfcCX2NQDN+nqnZU1daq2rpu3boRI0uSRjHSHkFV3cdgfn85bge2JDmXQQFczpHl8V8Z7F38TpK1DKaK7lnm50iSjsNIewRJfiTJJ5J8tVt+fpJfXexnquoQcDVwC3AXcFNV3Znk+iSXdsNuAb6VZB9wK/DWqvrWsf4ykqTlG/UYwW8DbwXeC1BVX+nO+f93i/1QVe0Gds9bd+3Q6wLe3H1JkiZg1GMEp1XV5+etO7TSYSRJ4zdqETyY5G/RXVyW5LUMriuQJE25UaeG3gDsAH40yUHgXgYXlUmSptyoRfD1qnpFktOBVVX1aJ+hJEnjM+rU0L1JdgAvAr7TYx5J0piNukfwo8CrGUwRvT/JHwI7q+pPe0vWkG9c/+Mrsp1DDz0LWMOhh76+ItvcdO0dxx9K0glvpD2CqvpuVd1UVf8EOB94JuD175J0Ehj5eQRJXpLkN4G9wKkMbjkhSZpyI00NJbkP+CJwE4Orf1fihnOSpBPAqMcInl9Vf9lrEknSRCz1hLKZqtoOvD3JEU8qq6p/3VsySdJYLLVHcFf3fU/fQSRJk7HUoyr/W/fyjqr6whjySJLGbNSzhv5DkruSvC3J83pNJEkaq1GvI3gp8FLgAeC9Se5Y6nkEkqTpMOpZQ1TVHPCuJLcCMwweXL/o8wjUhgvffeGKbOeUh09hFas48PCBFdnmbW+8bQVSSSe/UZ9Q9mNJrktyB/Bu4H8zeAaxJGnKjbpHcCOwE/gHVfUXPeaRJI3ZkkWQZDVwb1X95zHkkSSN2ZJTQ1X1JLAxySljyCNJGrNRp4buBW5Lsgt4+j5DVfXOXlJJksZm1CL48+5rFXBGf3EkSeM2UhFU1a/3HUSSNBmj3ob6VmChm869bMUT6ZitPfUp4FD3XZJGM+rU0FuGXp8KvAY4tPJxdDze8vyHJx1B0hQadWpo77xVtyX5fA95JEljNurU0LOGFlcBW4Eze0kkSRqrUaeG9vLXxwgOAfcBr+8jkCRpvJZ6QtnfBQ5U1bnd8usYHB+4D9jXezpJUu+WurL4vcATAEleDPx74IPAI8COfqNJksZhqamh1VX1UPf6nwE7qupjwMeSfKnXZJKksVhqj2B1ksNl8XLgk0PvjfwsA0nSiWupf8w/AnwqyYPA48BnAJI8h8H0kCRpyi26R1BVbwd+CfgA8FNVdfjMoVXAG5faeJKLk9ydZH+SaxYZ95oklWTr6NElSSthyemdqvrsAuu+ttTPdc8xuAF4JTAL3J5kV1XtmzfuDOBNwOdGDS1JWjkjParyGF0A7K+qe6rqCQZPOLtsgXFvA34D+H89ZpEkHUWfRXAOcGBoebZb97QkLwA2VtV/X2xDSbYl2ZNkzwMPPLDySSWpYX0WwaKSrALeyeAYxKKqakdVba2qrevWres/nCaiTiueOv0p6rQjbnQrqUd9ngJ6ENg4tLyhW3fYGcDzgD9JArAe2JXk0qra02MunaC+d+H3Jh1BalKfRXA7sCXJuQwK4HLgZw+/WVWPAGsPLyf5E+AtkyyBmZkZ5ubmWL9+Pdu3b59UDEkaq96KoKoOJbkauAVYDdxYVXcmuR7YU1W7+vrsYzU3N8fBgweXHihJJ5Ferw6uqt3A7nnrrj3K2Iv6zCJJWtjEDhZLkk4MFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuNOiofL/J23fmhFtnPGg4+yGvjGg4+uyDb3vuOq4w8lST1zj0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMadFNcRrJSnTjn9+75LUgssgiGPbfnpSUeQpLFzakiSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1LheiyDJxUnuTrI/yTULvP
/mJPuSfCXJJ5L8cJ95JElH6q0IkqwGbgAuAc4Drkhy3rxhXwS2VtXzgZuB7X3lkSQtrM89gguA/VV1T1U9AewELhseUFW3VtV3u8XPAht6zCNJWkCfzyw+BzgwtDwLvHCR8a8H/sdCbyTZBmwD2LRp00rlkzRhb7/ytSuynYfuf2Twfe6bK7LNX/ndm497G9PkhDhYnORKYCvwjoXer6odVbW1qrauW7duvOGkEZxVxbOqOKtq0lGkZetzj+AgsHFoeUO37vskeQXwK8BLquqveswj9ebKJ5+adATpmPW5R3A7sCXJuUlOAS4Hdg0PSHI+8F7g0qq6v8cskqSj6K0IquoQcDVwC3AXcFNV3Znk+iSXdsPeAfwQ8NEkX0qy6yibkyT1pM+pIapqN7B73rprh16/os/PlyQt7YQ4WCxJmhyLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktS4XosgycVJ7k6yP8k1C7z/A0n+oHv/c0k295lHknSk3oogyWrgBuAS4DzgiiTnzRv2euDbVfUc4D8Cv9FXHknSwvrcI7gA2F9V91TVE8BO4LJ5Yy4DPti9vhl4eZL0mEmSNE+qqp8NJ68FLq6qf9Et/zzwwqq6emjMV7sxs93yn3djHpy3rW3Atm7xucDdvYQeWAs8uOSoE5f5J2eas4P5J63v/D9cVesWemNNjx+6YqpqB7BjHJ+VZE9VbR3HZ/XB/JMzzdnB/JM2yfx9Tg0dBDYOLW/o1i04Jska4EzgWz1mkiTN02cR3A5sSXJuklOAy4Fd88bsAl7XvX4t8Mnqa65KkrSg3qaGqupQkquBW4DVwI1VdWeS64E9VbULeD/w4ST7gYcYlMWkjWUKqkfmn5xpzg7mn7SJ5e/tYLEkaTp4ZbEkNc4ikKTGWQSdpW6HcaJLcmOS+7trM6ZKko1Jbk2yL8mdSd406UzLkeTUJJ9P8uUu/69POtOxSLI6yReT/OGksyxXkvuS3JHkS0n2TDrPciU5K8nNSf4syV1J/t5YP99jBE/fDuNrwCuBWQZnPF1RVfsmGmwZkrwY+A7woap63qTzLEeSs4Gzq+oLSc4A9gL/eFr++3dXw59eVd9J8gzgT4E3VdVnJxxtWZK8GdgKPLOqXj3pPMuR5D5g6/yLUadFkg8Cn6mq93VnWZ5WVQ+P6/PdIxgY5XYYJ7Sq+jSDM6+mTlV9s6q+0L1+FLgLOGeyqUZXA9/pFp/RfU3VX1hJNgD/EHjfpLO0JsmZwIsZnEVJVT0xzhIAi+Cwc4ADQ8uzTNE/RCeT7g605wOfm3CUZemmVb4E3A/8z6qaqvzAfwJmgKcmnONYFfDHSfZ2t6SZJucCDwC/003NvS/J6eMMYBHohJHkh4CPAf+mqv5y0nmWo6qerKqfZHAF/QVJpmZ6Lsmrgfurau+ksxyHn6qqFzC42/EbuqnSabEGeAHwnqo6H3gMGOtxSotgYJTbYahH3dz6x4Dfq6r/Muk8x6rbpb8VuHjCUZbjQuDSbp59J/CyJL872UjLU1UHu+/3Ax9nMN07LWaB2aG9yJsZFMPYWAQDo9wOQz3pDra+H7irqt456TzLlWRdkrO61z/I4KSDP5toqGWoql+uqg1VtZnB//ufrKorJxxrZElO704yoJtS+Wlgas6eq6o54ECS53arXg6M9USJqbj7aN+OdjuMCcdaliQfAS4C1iaZBX6tqt4/2VQjuxD4eeCObp4d4N9W1e7JRVqWs4EPdmefrQJuqqqpOwVzij0b+Hj3KJM1wO9X1R9NNtKyvRH4ve4P0XuAXxjnh3v6qCQ1zqkhSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQ
TSUSR5srub5VeTfDTJace5vc3TeHdYnfwsAunoHq+qn+zu5voE8C9H+aEkXp+jqWIRSKP5DPCcJP8oyee6m4P9ryTPBkhyXZIPJ7mNwXO4n53k490zCr6c5O9321md5Le75xb8cXclsjRRFoG0hO4v/EuAOxg8a+BF3c3BdjK4Y+dh5wGvqKorgHcBn6qqn2Bw35jDV6pvAW6oqr8NPAy8Ziy/hLQId2Glo/vBoVtefIbB/ZCeC/xB9zCdU4B7h8bvqqrHu9cvA66CwZ1JgUeS/A3g3qo6vM29wOY+fwFpFBaBdHSPd7eWflqSdwPvrKpdSS4Crht6+7ERtvlXQ6+fBJwa0sQ5NSQtz5n89S3KX7fIuE8A/wqefmjNmX0Hk46VRSAtz3XAR5PsBRZ7Pu6bgJcmuYPBFNB5Y8gmHRPvPipJjXOPQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxv1/VmXu/3DGl3EAAAAASUVORK5CYII=\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.barplot(x=\"Parch\", y=\"Survived\", data=train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
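"5) The age density plots for survivors and non-survivors differ clearly to the left of age 15, where the non-overlapping area between the two curves is large. In the other age ranges the difference is small and is treated here as random variation. It is therefore worth separating out this younger age range."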
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(12.359751157407416, 0.5, 'density')"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 483.875x216 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"facet = sns.FacetGrid(train, hue=\"Survived\",aspect=2)\n",
"facet.map(sns.kdeplot,'Age',shade= True)\n",
"facet.set(xlim=(0, train['Age'].max()))\n",
"facet.add_legend()\n",
"plt.xlabel('Age') \n",
"plt.ylabel('density') "
]
},
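{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a rough numeric check of the density plot, the following sketch compares survival rates below and at or above age 15. It assumes the `train` DataFrame loaded earlier in this notebook; the cutoff of 15 is read off the plot and is an assumption, not a fitted threshold."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: survival rate and sample size on each side of the age-15 cutoff\n",
"age_known = train.dropna(subset=['Age'])  # drop rows with a missing Age\n",
"is_child = (age_known['Age'] < 15).rename('Age<15')\n",
"age_known.groupby(is_child)['Survived'].agg(['mean', 'count'])"
]
},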
{
"cell_type": "markdown",
"metadata": {},
"source": [
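"6) Embarked (port of embarkation) vs. survival. Result: passengers who embarked at C (Cherbourg) had a higher survival rate, so this column should also be kept as a model feature."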
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Embarked', ylabel='count'>"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.countplot(x='Embarked',hue='Survived',data=train)"
]
},
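{
"cell_type": "markdown",
"metadata": {},
"source": [
"The count plot shows absolute numbers, so the survival rate per port is easier to read from a grouped mean. A minimal sketch, assuming the `train` DataFrame loaded earlier in this notebook:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: survival rate (mean of Survived) and passenger count per embarkation port\n",
"train.groupby('Embarked')['Survived'].agg(['mean', 'count'])"
]
},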
{
"cell_type": "markdown",
"metadata": {},
"source": [
"7) Title Feature (New): survival rates differ by passenger title"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add a Title feature: extract each passenger's title from the Name column and group the titles into six categories."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"all_data['Title'] = all_data['Name'].apply(lambda x:x.split(',')[1].split('.')[0].strip())\n",
"Title_Dict = {}\n",
"Title_Dict.update(dict.fromkeys(['Capt', 'Col', 'Major', 'Dr', 'Rev'], 'Officer'))\n",
"Title_Dict.update(dict.fromkeys(['Don', 'Sir', 'the Countess', 'Dona', 'Lady'], 'Royalty'))\n",
"Title_Dict.update(dict.fromkeys(['Mme', 'Ms', 'Mrs'], 'Mrs'))\n",
"Title_Dict.update(dict.fromkeys(['Mlle', 'Miss'], 'Miss'))\n",
"Title_Dict.update(dict.fromkeys(['Mr'], 'Mr'))\n",
"Title_Dict.update(dict.fromkeys(['Master','Jonkheer'], 'Master'))"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Title', ylabel='Survived'>"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"all_data['Title'] = all_data['Title'].map(Title_Dict)\n",
"sns.barplot(x=\"Title\", y=\"Survived\", data=all_data)"
]
},
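{
"cell_type": "markdown",
"metadata": {},
"source": [
"`Series.map` with a dict returns NaN for any key that is missing from the dict, so it can be worth checking that every extracted title was covered by `Title_Dict`. A small sanity-check sketch, assuming `all_data` already has the mapped `Title` column from the cell above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: count titles that were not covered by Title_Dict (these become NaN after map)\n",
"print('unmapped titles:', all_data['Title'].isna().sum())\n",
"# Survival rate per title group; test rows have Survived = NaN and are skipped by mean/count\n",
"all_data.groupby('Title')['Survived'].agg(['mean', 'count'])"
]
},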
{
"cell_type": "markdown",
"metadata": {},
"source": [
"8) FamilyLabel Feature (New): passengers with a family size of 2 to 4 have a higher survival rate"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add a FamilyLabel feature: first compute FamilySize = Parch + SibSp + 1, then group FamilySize into three classes."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='FamilySize', ylabel='Survived'>"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"all_data['FamilySize']=all_data['SibSp']+all_data['Parch']+1\n",
"sns.barplot(x=\"FamilySize\", y=\"Survived\", data=all_data)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Group FamilySize into three classes by survival rate to form the FamilyLabel feature."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='FamilyLabel', ylabel='Survived'>"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"def Fam_label(s):\n",
"    # Map FamilySize to a survival tier: 2 = high, 1 = medium, 0 = low\n",
"    if 2 <= s <= 4:\n",
"        return 2\n",
"    elif (4 < s <= 7) or s == 1:\n",
"        return 1\n",
"    elif s > 7:\n",
"        return 0\n",
"all_data['FamilyLabel']=all_data['FamilySize'].apply(Fam_label)\n",
"sns.barplot(x=\"FamilyLabel\", y=\"Survived\", data=all_data)"
]
},
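{
"cell_type": "markdown",
"metadata": {},
"source": [
"To confirm that each FamilySize value landed in the intended class, a cross-tabulation of the two columns is a quick check. A minimal sketch, assuming `all_data` has the `FamilySize` and `FamilyLabel` columns created above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: rows are FamilySize values, columns are FamilyLabel classes;\n",
"# each row should have a nonzero count in exactly one column\n",
"pd.crosstab(all_data['FamilySize'], all_data['FamilyLabel'])"
]
},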
{
"cell_type": "markdown",
"metadata": {},
"source": [
"9) Deck Feature (New): survival rates differ by deck"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add a Deck feature: first fill missing Cabin values with 'Unknown', then take the first letter of Cabin as the passenger's deck."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='Deck', ylabel='Survived'>"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAATKUlEQVR4nO3df7DldX3f8edrF3cXCJDorlmHHy5NligVKslKbLHBnwk6UzCRSUGtNmNDJyO2nairrRlCMIzJ2pAaxcg2MSqJEqLTzGa6lbSBaErEsKiAwOCsgLIbb1lYIQiky8q7f5yz5HD37r2H2/s933P5PB8zd875fr+f8/2+79mz53W/n8/3R6oKSVK7VvRdgCSpXwaBJDXOIJCkxhkEktQ4g0CSGndY3wU8XWvXrq0NGzb0XYYkLSs33XTT/VW1bq5lyy4INmzYwI4dO/ouQ5KWlSTfOtQyu4YkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4zoLgiQfT3Jfkq8fYnmS/E6SnUluSfLjXdUiSTq0LvcIPgGcNc/y1wIbhz8XAL/bYS2SpEPo7ISyqvpikg3zNDkH+FQNbohwQ5IfTPK8qvpOVzXpmWHz5s3MzMywfv16tmzZ0nc50rLX55nFxwL3jkzvGs47KAiSXMBgr4ETTjhhIsVpes3MzLB79+6+y5CeMZbFYHFVba2qTVW1ad26OS+VIUlapD6DYDdw/Mj0ccN5kqQJ6jMItgFvGR499FLgIccHJGnyOhsjSPIZ4OXA2iS7gF8FngVQVR8DtgOvA3YCjwK/0FUtkqRD6/KoofMXWF7A27vaviRpPMtisFiS1B2DQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhrX2a0qpZZs3ryZmZkZ1q9fz5YtW/ouR3paDALNyy+48czMzLB79+6+y1gW/ExNH4NA8/ILTkvNz9T0cYxAkhpnEEhS4wwCSWqcQSBJjXOwWFLzWj+SySCQ1LzWj2Sya0iSGmcQSFLjDAJJapxBIEmNMwgkqXGdBkGSs5LcmWRnkvfOsfyEJNcl+WqSW5K8rst6JEkH6ywIkqwELgdeC5wMnJ/k5FnNfgW4uqpOA84DPtpVPZKkuXW5R3A6sLOq7qqqfcBVwDmz2hRw9PD5McDfdliPJGkOXQbBscC9I9O7hvNGXQy8OckuYDvwjrlWlOSCJDuS7NizZ08XtUpSs/oeLD4f+ERVHQe8DrgyyUE1VdXWqtpUVZvWrVs38SIl6ZmsyyDYDRw/Mn3ccN6otwFXA1TVl4A1wNoOa5IkzdJlENwIbExyYpJVDAaDt81q823gVQBJXsggCOz7kaQJ6iwIqmo/cCFwDXAHg6ODbktySZKzh83eCfxikpuBzwD/uqqqq5okSQfr9OqjVbWdwSDw6LyLRp7fDpzRZQ2SpPn1PVgsSeqZQSBJjTMIJKlx3qFMmuUj7/yzp/2aB+9/5MnHxbz+wt/6F0/7NdJSMQgaccaHFzcmv+rBVaxgBfc+eO+i1nH9O65f1HYlTY5dQ5LUOPcIJC3KHZdeu6jX7dv72JOPi1nHC9/3ykVtV4dmEEjPYJs3b2ZmZob169ezZcuWvsvRlDIIpGewmZkZdu+efYkv6akcI5CkxhkEktQ4u4amiP25kvpgEEwR+3Ml9cGuIUlqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxnd6zOMlZwIeAlcDvVdVvzNHm54GLgQJurqo3dlmTpGeuiy++eFGv27t375OPi1nHYrc7LeYNgiQPM/iCnlNVHT3Pa1cClwOvAXYBNybZVlW3j7TZCPxH4Iyq+m6S5z7N+iUtM89Zc8
xTHtW/eYOgqo4CSPJ+4DvAlUCANwHPW2DdpwM7q+qu4TquAs4Bbh9p84vA5VX13eH27lvE7yBpGbnwNHf6p824YwRnV9VHq+rhqvq7qvpdBl/q8zkWuHdketdw3qiTgJOSXJ/khmFXkrTsHLnqaI5c/YMcueqQO8nS1Bp3jOCRJG8CrmLQVXQ+8MgSbX8j8HLgOOCLSU6pqgdHGyW5ALgA4IQTTliCzUpL64wf+bm+S5AWbdwgeCODQd8PMQiC64fz5rMbOH5k+rjhvFG7gC9X1ePA3Um+wSAYbhxtVFVbga0AmzZtOuSYxbT49iWnLOp1+/c+GziM/Xu/tah1nHDRrYvarqS2jRUEVXUPC3cFzXYjsDHJiQwC4DwODo8/ZbB38QdJ1jLoKrrraW5HHaojiid4gjpi6vNX0iKNNUaQ5KQkf5Hk68PpU5P8ynyvqar9wIXANcAdwNVVdVuSS5KcPWx2DfBAktuB64B3V9UDi/1ltPQeP+Nx9r1mH4+f8XjfpUjqyLhdQ/8VeDdwBUBV3ZLk08Cvz/eiqtoObJ8176KR5wX88vBHktSDcY8aOqKq/mbWvP1LXYwkafLGDYL7k/wIw5PLkpzL4LwCSdIyN27X0NsZHLXzgiS7gbsZnFQmSVrmxg2Cb1XVq5McCayoqoe7LEqSNDnjdg3dnWQr8FLgex3WI0masHGD4AXA/2LQRXR3ko8keVl3ZUmSJmWsIKiqR6vq6qr6OeA04GjgC51WJkmaiLFvTJPkzCQfBW4C1gA/31lVkqSJGWuwOMk9wFeBqxmc/bsUF5yTJE2BcY8aOrWq/q7TSiRJvVjoDmWbq2oLcGmSg646VlX/rrPKJEkTsdAewR3Dxx1dFyJJ6sdCt6r8s+HTW6vqKxOoR5I0YeMeNfRbSe5I8v4kL+q0IknSRI17HsErgFcAe4Arkty60P0IJEnLw7hHDVFVM8DvJLkO2AxcxAL3I5Dm84WfOnNRr3vssJWQ8NiuXYtax5lf9FxIadS4dyh7YZKLk9wKfBj4awb3IJYkLXPj7hF8HLgK+Jmq+tsO65EkTdiCQZBkJXB3VX1oAvVIkiZswa6hqvo+cHySVROoR5I0YeN2Dd0NXJ9kG/DkdYaq6rJOqpIkTcy4QfDN4c8K4KjuypEkTdpYQVBVv9Z1IZIO7dI3n7uo1+2976HB48x3FrWO9/3hZxe1XS0v416G+jpgrovOvXLJK5IkTdS4XUPvGnm+BngDsH/py5EkTdq4XUM3zZp1fZK/6aCepq1d8wSwf/goSZMxbtfQs0cmVwCbgGM6qahh7zr1wb5LkNSgcbuGbuIfxgj2A/cAb+uiIEnSZC10h7KXAPdW1YnD6bcyGB+4B7i98+okSZ1b6MziK4B9AEl+CvgA8EngIWBrt6VJkiZhoa6hlVW1d/j8XwJbq+pzwOeSfK3TyiRJE7HQHsHKJAfC4lXAtSPLxr6XgSRpei30Zf4Z4AtJ7gceA/4KIMmPMugekiQtc/PuEVTVpcA7gU8AL6uqA0cOrQDesdDKk5yV5M4kO5O8d552b0hSSTaNX7okaSks2L1TVTfMMe8bC71ueB+Dy4HXALuAG5Nsq6rbZ7U7Cvj3wJfHLVqStHTGulXlIp0O7Kyqu6pqH4M7nJ0zR7v3A78J/H2HtUiSDqHLIDgWuHdketdw3pOS/DhwfFX99/lWlOSCJDuS7NizZ8/SVypJDesyCOaVZAVwGYMxiHlV1daq2lRVm9atW9d9cZLUkC6DYDdw/Mj0ccN5BxwFvAj4yyT3AC8FtjlgLEmT1WUQ3AhsTHLi8H7H5wHbDiysqoeqam1VbaiqDcANwNlVtaPDmiRJs3QWBFW1H7gQuAa4A7i6qm5LckmSs7variQ9XatXr+bwww9n9erVfZfSi07PDq6q7cD2WfMuOkTbl3dZiyQdyimnnNJ3Cb3qbbBYkjQdDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMY1e3OZzZs3MzMzw/r169
myZUvf5UhSb5oNgpmZGXbv3r1wQ0l6hrNrSJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxi378wh+4t2fWtTrjrr/YVYC377/4UWt46YPvmVR25WkaeMegSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWrcsj98dLGeWHXkUx4lqVXNBsEjG3+67xIkaSrYNSRJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqXKdBkOSsJHcm2ZnkvXMs/+Uktye5JclfJHl+l/VIkg7WWRAkWQlcDrwWOBk4P8nJs5p9FdhUVacCnwW2dFWPJGluXe4RnA7srKq7qmofcBVwzmiDqrquqh4dTt4AHNdhPZKkOXQZBMcC945M7xrOO5S3Af9jrgVJLkiyI8mOPXv2LGGJkqSpGCxO8mZgE/DBuZZX1daq2lRVm9atWzfZ4qRlbM3KFRy+cgVrVk7Ff3VNqS6vProbOH5k+rjhvKdI8mrgfcCZVfV/O6xHas5pzzmq7xK0DHT5Z8KNwMYkJyZZBZwHbBttkOQ04Arg7Kq6r8NaJEmH0FkQVNV+4ELgGuAO4Oqqui3JJUnOHjb7IPADwJ8k+VqSbYdYnSSpI53emKaqtgPbZ827aOT5q7vcviRpYY4gSVLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4zoNgiRnJbkzyc4k751j+eokfzxc/uUkG7qsR5J0sM6CIMlK4HLgtcDJwPlJTp7V7G3Ad6vqR4HfBn6zq3okSXPrco/gdGBnVd1VVfuAq4BzZrU5B/jk8PlngVclSYc1SZJmSVV1s+LkXOCsqvo3w+l/BfxkVV040ubrwza7htPfHLa5f9a6LgAuGE7+GHDnEpW5Frh/wVaTZU3jsabxTWNd1jSepazp+VW1bq4Fhy3RBjpVVVuBrUu93iQ7qmrTUq/3/4c1jceaxjeNdVnTeCZVU5ddQ7uB40emjxvOm7NNksOAY4AHOqxJkjRLl0FwI7AxyYlJVgHnAdtmtdkGvHX4/Fzg2uqqr0qSNKfOuoaqan+SC4FrgJXAx6vqtiSXADuqahvw+8CVSXYCexmExSQteXfTErCm8VjT+KaxLmsaz0Rq6mywWJK0PHhmsSQ1ziCQpMY1FwRJNgzPXxidd3GSd/VV07CG9UmuSvLNJDcl2Z7kpJ5r+n6Sr438HHSZkB5q+uEkn05y1/B9+lKSn+25pgPv021Jbk7yziS9/99K8vokleQFfdcCT3mfbk7ylST/rO+aYM7P+Yae63nOSC0zSXaPTK/qYpvL4jyCZ7rh2dT/DfhkVZ03nPdPgB8GvtFjaY9V1Yt73P5TDN+nP2XwPr1xOO/5wNl91sXI+5TkucCngaOBX+2zKOB84H8PH/uuBZ76Pv0M8AHgzF4rGpiqz3lVPQC8GAZ/pALfq6r/3OU2e/+rRQC8Ani8qj52YEZV3VxVf9VjTdPolcC+We/Tt6rqwz3W9BRVdR+Ds+Av7PNyKUl+AHgZg+t5TfpovHEcDXy37yI04B7BdHgRcFPfRczh8CRfG5n+QFX9cV/FAP8Y+EqP2x9LVd01vOjic4H/01MZ5wCfr6pvJHkgyU9UVd+fsQOfpzXA8xgE+zQY/ZzfXVW9djX2ocUgONTxsh5He7Cp2mWeLcnlDP7q3VdVL+m7nilzPvCh4fOrhtN9B8Fo19A/BT6V5EVTcBLpVH/OJ6HFIHgA+KFZ854N3N1DLQfcxuDMas3vNuANByaq6u1J1gI7+ivpYEn+EfB94L6etv9sBn9tn5KkGJzQWUnePQVfugBU1ZeG/3br6Ol90j9oboygqr4HfC
fJK+HJ/zRnMRhU68u1wOrhVVYBSHJqkn/eY03T6FpgTZJfGpl3RF/FzCXJOuBjwEd6/NI9F7iyqp5fVRuq6ngGf+hMzedpeCTTSry22FRocY8A4C3A5UkuG07/WlV9s69iqqqGh0D+lyTvAf4euAf4D33VNDR7jODzVdXbIaTD9+n1wG8n2QzsAR4B3tNXTUMH3qdnAfuBK4HL5n1Ft87n4Js8fW44/4uTL+dJo5+nAG+tqu/3WI+GvMSEJDWuua4hSdJTGQSS1DiDQJIaZxBIUuMMAklqnEEgzWMpryya5C+TTNXN0SVo9zwCaVzTemVRacm4RyCNafaVRZOsTPLBJDcmuSXJvz3QNsl7ktw63Iv4jdH1JFmR5BNJfn3Sv4M0F/cIpKdh1pVFzwEeqqqXJFkNXJ/kz4EXDJf9ZFU9OryMyQGHAX8EfL2qLp10/dJcDAJp8X4aODXJgQsGHgNsBF4N/EFVPQpQVXtHXnMFcLUhoGli15D0NMy6smiAd1TVi4c/J1bVny+wir8GXpFkTde1SuMyCKQxzXFl0WuAX0ryrOHyk5IcCfxP4BeSHDGcP9o19PvAduDqJO6Rayr4QZTmN9+VRX8P2AB8ZXhbyj3A66vq80leDOxIso/BF/9/OrDCqrosyTHAlUneVFVPTOqXkebi1UclqXF2DUlS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1Lj/B4Y7TjnYX5nXAAAAAElFTkSuQmCC\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"all_data['Cabin'] = all_data['Cabin'].fillna('Unknown')\n",
"all_data['Deck']=all_data['Cabin'].str.get(0)\n",
"sns.barplot(x=\"Deck\", y=\"Survived\", data=all_data)"
]
},
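{
"cell_type": "markdown",
"metadata": {},
"source": [
"10) TicketGroup feature (new): passengers whose ticket number is shared by 2 to 4 people in total have a higher survival rate."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Add a TicketGroup feature that records, for each passenger, how many passengers share the same ticket number."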
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='TicketGroup', ylabel='Survived'>"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAUgElEQVR4nO3dfZQldX3n8fdnBhFFHqLMOoaBhVUUXDFgRkJCVlTUoNmF9SmCkk1OSDjZFWOMOqtHl0MwJnE0ZknERGKIWWMgBLPZUTHoKuIGH2AGBWSILvI4Ix0GAQV0AwPf/eNW46WnZ+bSTHV1z+/9OqfPvbdu1b0fUPrT9auqX6WqkCS1a8nQASRJw7IIJKlxFoEkNc4ikKTGWQSS1Lhdhg7wSO2zzz51wAEHDB1DkhaVdevW3V5Vy2Z7b9EVwQEHHMDatWuHjiFJi0qSm7b2nkNDktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMYtugvKpFWrVjE1NcXy5ctZvXr10HGkRc8i0KIzNTXFxo0bh44h7TQcGpKkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhrXaxEkOTbJN5Ncl+Rts7y/f5KLk3wtyVVJXtZnHknSlnorgiRLgbOAlwLPBE5M8swZq70TOL+qDgdOAD7YVx5J0uz63CM4Ariuqq6vqvuA84DjZ6xTwJ7d872A7/SYR5I0iz6LYF/glrHXG7pl404HTkqyAbgQeMNsH5TklCRrk6zdtGlTH1klqVlDHyw+EfhIVa0AXgZ8NMkWmarq7KpaWVUrly1bNu8hJWln1mcRbAT2G3u9ols27mTgfICq+jKwG7BPj5kkSTP0WQSXAwclOTDJrowOBq+Zsc7NwDEASQ5hVASO/UjSPOqtCKpqM3AqcBFwLaOzg65JckaS47rV3gz8WpIrgXOBX66q6iuTJGlLvd6zuKouZHQQeHzZaWPP1wNH9ZlBkrRtQx8sliQNzCKQpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjev1OgItfqtWrWJqaorly5ezevXqoeNI6oFFoG2amppi48aZU0RJ2pk4NCRJjbMIJKlxDg1pMJc87+g5bffDXZZCwg83bJjTZxz9xUvm9L3Szso9AklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjLAJJapxFIEmNswgkqXEWgSQ1ziKQpMZZBJLUOItAkhpnEUhS4ywCSWqcdyhrxFF/fNScttv1rl1ZwhJuueuWOX3GpW+4dE7fK2n+uEcgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjvI5AUvNWrVrF1NQUy5cvZ/Xq1UPHmXcWgaTmTU1NsXHjxqFjDKbXoaEkxyb5ZpLrkrxtK+v8QpL1Sa5J8td95pEkbam3PYIkS4GzgBcDG4DLk6ypqvVj6xwEvB04qqruTPKv+sojSZpdn3sERwDXVdX1VXUfcB5w/Ix1fg04q6ruBKiq23rMI0maRZ9FsC9wy9jrDd2ycU8Hnp7k0iRfSXLsbB+U5JQka5Os3bRpU09xJalNQ58+ugtwEPB84ETgz5LsPXOlqjq7qlZW1cply5bNb0JJ2sn1edbQRmC/sdcrumXjNgBfrar7gRuSfItRMVzeY64Fq/VT2CQNo889gsuBg5IcmGRX4ARgzYx1/p7R3gBJ9mE0VHR9j5kWtOlT2KampoaOsqDtXcUTq9i7augo0k6htz2Cqtqc5FTgImApcE5VXZPkDGBtVa3p3ntJkvXAA8Bbq+q7fWXSzuGkBx4cOoK0U+n1grKquhC4cMay08aeF/Bb3Y8kaQBDHyyWJA3MIpCkxlkEktQ4i0CSGmcRSFLjLAJJapz3I9A21eOLB3mQerwXb0k7K4tA23T/UfcPHUFSzxwakqTGWQSS1DiLQJIat81jBEnuBrZ6lLCq9t
zhiaRFyCnEtZhtswiqag+AJO8CbgU+CgR4HfCU3tNJi8T0FOLSYjTp0NBxVfXBqrq7qr5fVX/ClvcfliQtQpMWwb1JXpdkaZIlSV4H3NtnMEnS/Jj0OoLXAmd2PwVc2i3TLG4+49A5bbf5jicCu7D5jpvm9Bn7n3b1nL5XUtsmKoKquhGHgiRppzTR0FCSpyf5XJJvdK+fneSd/UaTJM2HSY8R/BnwduB+gKq6itHN6CVJi9ykRfD4qrpsxrLNOzqMJGn+TVoEtyd5Kt3FZUlexei6AknSIjfpWUOvB84GDk6yEbiB0UVlkqRFbtIiuKmqXpRkd2BJVd3dZyhJ0vyZdGjohiRnA0cC9/SYR5I0zyYtgoOB/81oiOiGJB9I8rP9xZIkzZeJiqCqflBV51fVK4DDgT2BS3pNJkmaFxPfjyDJ0Uk+CKwDdgN+obdUkqR5M9HB4iQ3Al8DzgfeWlVOOCdJO4lJzxp6dlV9v9ckkqRBbO8OZauqajXw7iRb3Kmsqn6jt2SSpHmxvT2Ca7vHtX0HkSQNY3u3qvxE9/TqqrpiHvJIkubZpGcN/UGSa5O8K8mzek0kSZpXk15H8ALgBcAm4ENJrvZ+BDvePrs9yJMft5l9dntw6CiSGjLpWUNU1RTwR0kuBlYBpwG/01ewFr3l2XcNHUFSgya9Q9khSU5PcjXwx8CXgBW9JpMkzYtJ9wjOAc4Dfq6qvtNjHkk70KpVq5iammL58uWsXr166DhaoLZbBEmWAjdU1ZnzkEfSDjQ1NcXGjRuHjqEFbrtDQ1X1ALBfkl3nIY8kaZ5NOjR0A3BpkjXAQ/MMVdX7t7VRkmOBM4GlwIer6ve3st4rgQuA51aVF69J0jyatAi+3f0sAfaYZINuSOks4MXABuDyJGuqav2M9fYA3gh8ddLQkqQdZ6IiqKrfnsNnHwFcV1XXAyQ5DzgeWD9jvXcB7wHeOofvkCQ9SpNOQ30xMNukcy/cxmb7AreMvd4A/NSMz30OsF9VfSrJVosgySnAKQD777//JJElSROadGjoLWPPdwNeCWx+NF+cZAnwfuCXt7duVZ0NnA2wcuXKLQpJkjR3kw4NrZux6NIkl21ns43AfmOvV3TLpu0BPAv4QhKA5cCaJMd5wFiS5s+kQ0NPHHu5BFgJ7LWdzS4HDkpyIKMCOAF47fSbVfU9YJ+x7/gC8BZLQJLm16RDQ+v40TGCzcCNwMnb2qCqNic5FbiI0emj51TVNUnOANZW1Zq5RZak2Z1++ulz2u6OO+546HEunzHX710otneHsucCt1TVgd3rX2J0fOBGtjz7ZwtVdSFw4Yxlp21l3edPlFjSoua0FwvP9q4s/hBwH0CS5wG/B/wl8D26g7eS9EhMT3sxNTU1dBR1tjc0tLSq7uievwY4u6o+Dnw8ydd7TSZJmhfb2yNYmmS6LI4BPj/23sT3MpAkLVzb+2V+LnBJktuBHwL/ByDJ0xgND0mSFrnt3bz+3Uk+BzwF+ExVTZ85tAR4Q9/hJEn92+7wTlV9ZZZl3+onjiRpvk10q0pJ0s7LIpCkxlkEktQ4i0CSGue1ANIMH3jzJx7xNnfdfu9Dj3PZ/tQ/+A+PeBtpR3GPQJIaZxFIUuMsAklqnEUgSY1r9mCxc6JL0kizRTA9J7oktc6hIUlqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGmcRSFLjFv0FZT/51v8xp+32uP1ulgI33373nD5j3Xv/05y+V5IWGvcIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY1b9BeUzdWDu+7+sEdJalWzRXDvQS8ZOoIkLQgODUlS4ywCSWqcRSBJjev1GEGSY4EzgaXAh6vq92e8/1vArwKbgU3Ar1TVTX1mkhajd5/0qjltd8dt3xs9Tt06p894x19dMKfv1eLSWxEkWQqcBbwY2ABcnmRNVa0fW+1rwMqq+kGS/wysBl7TVyZJO8617/78nLa774
4fPvQ4l8845B0vnNP3auv6HBo6Ariuqq6vqvuA84Djx1eoqour6gfdy68AK3rMI0maRZ9FsC9wy9jrDd2yrTkZ+PRsbyQ5JcnaJGs3bdq0AyNKkhbEweIkJwErgffO9n5VnV1VK6tq5bJly+Y3nCTt5Po8WLwR2G/s9Ypu2cMkeRHwDuDoqvqXHvNIkmbR5x7B5cBBSQ5MsitwArBmfIUkhwMfAo6rqtt6zCJJ2oreiqCqNgOnAhcB1wLnV9U1Sc5Icly32nuBJwB/m+TrSdZs5eMkST3p9TqCqroQuHDGstPGnr+oz++XJG3fgjhYLEkajkUgSY2zCCSpcRaBJDXOIpCkxlkEktS4Zm9VKUnTHvvYxz7ssTUWgaTmHXrooUNHGJRDQ5LUOPcIpB1g9133fNijtJhYBNIOcNRTXzF0BGnOHBqSpMZZBJLUOItAkhpnEUhS4ywCSWqcRSBJjbMIJKlxFoEkNc4ikKTGWQSS1DiLQJIaZxFIUuMsAklqnEUgSY2zCCSpcRaBJDXOIpCkxlkEktQ4i0CSGuc9i6Wd2G5LlzzsUZqNRSDtxA5/0h5DR9Ai4J8JktQ49wgkzasn7bbXwx41PItA0rw69fDXDh1BMzg0JEmNswgkqXEWgSQ1ziKQpMb1WgRJjk3yzSTXJXnbLO8/NsnfdO9/NckBfeaRJG2ptyJIshQ4C3gp8EzgxCTPnLHaycCdVfU04A+B9/SVR5I0uz73CI4Arquq66vqPuA84PgZ6xwP/GX3/ALgmCTpMZMkaYZUVT8fnLwKOLaqfrV7/YvAT1XVqWPrfKNbZ0P3+tvdOrfP+KxTgFO6l88AvrmDYu4D3L7dteaXmSZjpsktxFxmmsyOzPSvq2rZbG8sigvKqups4Owd/blJ1lbVyh39uY+GmSZjpsktxFxmmsx8ZepzaGgjsN/Y6xXdslnXSbILsBfw3R4zSZJm6LMILgcOSnJgkl2BE4A1M9ZZA/xS9/xVwOerr7EqSdKsehsaqqrNSU4FLgKWAudU1TVJzgDWVtUa4M+Bjya5DriDUVnMpx0+3LQDmGkyZprcQsxlpsnMS6beDhZLkhYHryyWpMZZBJLUuCaLIMk5SW7rrmMYXJL9klycZH2Sa5K8cehMAEl2S3JZkiu7XL89dKZpSZYm+VqSTw6dBSDJjUmuTvL1JGuHzgOQZO8kFyT5pyTXJvnpBZDpGd2/o+mf7yf5zQWQ603d/8e/keTcJLsNkGGL30tJXt3lejBJb6eRNlkEwEeAY4cOMWYz8OaqeiZwJPD6WabjGMK/AC+sqp8ADgOOTXLksJEe8kbg2qFDzPCCqjpsAZ2LfibwD1V1MPATLIB/X1X1ze7f0WHATwI/AP7nkJmS7Av8BrCyqp7F6OSW+T5xBWb/vfQN4BXAF/v84iaLoKq+yOgspQWhqm6tqiu653cz+g9232FTQY3c0718TPcz+NkFSVYAPw98eOgsC1WSvYDnMTozj6q6r6ruGjTUlo4Bvl1VNw0dhNEZlI/rrmd6PPCd+Q4w2++lqrq2qnbUTApb1WQRLGTdDKyHA18dOArw0BDM14HbgM9W1ULI9d+BVcCDA+cYV8BnkqzrpkQZ2oHAJuAvuiG0DyfZfehQM5wAnDt0iKraCLwPuBm4FfheVX1m2FTzyyJYQJI8Afg48JtV9f2h8wBU1QPdbvwK4IgkzxoyT5J/D9xWVeuGzDGLn62q5zCabff1SZ43cJ5dgOcAf1JVhwP3AltMBT+U7iLT44C/XQBZfozRBJgHAj8O7J7kpGFTzS+LYIFI8hhGJfCxqvq7ofPM1A0rXMzwx1aOAo5LciOjGW1fmOSvho300F+VVNVtjMa8jxg2ERuADWN7cBcwKoaF4qXAFVX1z0MHAV4E3FBVm6rqfuDvgJ8ZONO8sggWgG7q7T8Hrq2q9w+dZ1qSZUn27p4/Dngx8E9DZqqqt1fViqo6gNHQwueratC/3pLsnmSP6efASxgd5BtMVU
0BtyR5RrfoGGD9gJFmOpEFMCzUuRk4Msnju/8Wj2EBHFifT00WQZJzgS8Dz0iyIcnJA0c6CvhFRn/dTp9W97KBMwE8Bbg4yVWM5o76bFUtiNM1F5gnA/+Y5ErgMuBTVfUPA2cCeAPwse5/v8OA3x02zkhXli9m9Jf34Lq9pguAK4CrGf1enPfpJmb7vZTk5Uk2AD8NfCrJRb18t1NMSFLbmtwjkCT9iEUgSY2zCCSpcRaBJDXOIpCkxlkE2ikledLYqbhTSTZ2z+9J8sHtbHvPtt6fse7zk/zMjGUnJbmqmzXyym56h73n+I8i9a63W1VKQ6qq7zI6d54kpwP3VNX7eviq5wP3AF/qvutY4E3AS6tqY5KljO7L/WTgrvENkyytqgd6yCQ9Iu4RqCndX/Cf7J4/IclfdPcRuCrJK2esu0+SLyf5+e4q648nubz7OaqbIPDXgTd1exv/DngH8JaxKSceqKpzpmeQ7O5b8J4kVwCvTnJi9/3fSPKese++Z+z5q5J8pHv+kSR/mmRtkm91cy9Jj4p7BGrZf2M00+Sh8NDkY3TPnwysAd5ZVZ9N8tfAH1bVPybZH7ioqg5J8qeM7W0k+beMrlDdlu9W1XOS/DjwFUbz8t/JaPbS/1hVf7+d7Q9gNJfRUxld+f20qvp/j+wfXfoR9wjUshcBZ02/qKo7u6ePAT4HrKqqz46t+4FuSu41wJ7dbLFbleTQbk/h20leM/bW33SPzwW+0E12thn4GKN7CGzP+VX1YFX9X+B64OAJtpG2yiKQtrQZWAf83NiyJcCR03fXqqp9x27aM+4aulk+q+rqbgrvTwOPG1vn3gkyjM/9MvO2iTPnhXGeGD0qFoFa9lng9dMvxoaGCvgV4OAk/7Vb9hlGk7hNr3tY9/RuYI+xz/w94H3dXdSmjZfAuMuAo7tjEUsZzch5SffePyc5JMkS4OUztnt1kiVJngr8G6D3O1hp52YRqGW/A/xYd6D2SuAF0290Z/OcyGhG2P9Cd0/b7qDyekYHiQE+Abx8+mBxVV0I/BHw6STrk3wJeADYYtbIqrqV0c1iLgauBNZV1f/q3n4b8ElGZyPdOmPTmxmVyKeBX/f4gB4tZx+VFpHu7KFPVtUFQ2fRzsM9AklqnHsEktQ49wgkqXEWgSQ1ziKQpMZZBJLUOItAkhr3/wHCONKiV+yIFwAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"Ticket_Count = dict(all_data['Ticket'].value_counts())\n",
"all_data['TicketGroup'] = all_data['Ticket'].apply(lambda x:Ticket_Count[x])\n",
"sns.barplot(x='TicketGroup', y='Survived', data=all_data)"
]
},
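{
"cell_type": "markdown",
"metadata": {},
"source": [
"The per-ticket count in the cell above can also be computed without an intermediate dict. The following is a minimal sketch on a made-up frame: the `toy` DataFrame and its ticket strings are illustrative assumptions, not rows from the Titanic data. It maps each ticket onto its entry in `value_counts()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data, for illustration only\n",
"toy = pd.DataFrame({'Ticket': ['A1', 'A1', 'B2', 'C3', 'C3', 'C3']})\n",
"# value_counts() returns a Series indexed by ticket; map() looks up each row's count\n",
"toy['TicketGroup'] = toy['Ticket'].map(toy['Ticket'].value_counts())\n",
"toy"
]
},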
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bin TicketGroup into three classes according to survival rate."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<AxesSubplot:xlabel='TicketGroup', ylabel='Survived'>"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAASPUlEQVR4nO3df5BdZ13H8fcniRGBgj+6Sm1SUyEIESrgUh2r/JB2TNVpVUAbUGBEM4wE8AfGMGLVouNQFccfUYhaRUcMFRxdMEyoWFCLQDa1UJJMIKaFZiWytBUp/miXfv1jT/C6vcneLTl7s3ner5k7Oc85z7n3u71pPnuec85zUlVIktq1atwFSJLGyyCQpMYZBJLUOINAkhpnEEhS49aMu4ClOvfcc2vDhg3jLkOSVpT9+/d/qqomhm1bcUGwYcMGpqenx12GJK0oST52sm0ODUlS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIat+JuKJOk02379u0cP36cRz3qUVx77bXjLmfZGQSSmnf8+HFmZmbGXcbYODQkSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmN6zUIkmxOcjjJkSQ7TtLn+5McTHIgyZv6rEeS9EC9TTGRZDWwE7gMOAbsSzJVVQcH+mwEXgVcUlV3J/nKvuqRJA3X5xHBxcCRqjpaVfcCu4ErF/T5UWBnVd0NUFWf7LEeSdIQfQbB+cAdA+1j3bpBjwUem+SmJO9LsnnYGyXZmmQ6yfTs7GxP5UpSm8Z9sngNsBF4BrAF+P0kX7qwU1XtqqrJqpqcmJhY3gol6SzXZxDMAOsH2uu6dYOOAVNVdV9V3QZ8hPlgkCQtkz6DYB+wMcmFSdYCVwFTC/r8FfNHAyQ5l/mhoqM91iRJWqC3IKiqOWAbsBc4BFxfVQeSXJPkiq7bXuDOJAeBG4Gfrqo7+6pJkvRAqapx17Akk5OTNT09Pe4yJJ3CJb99ybhLWJK1N6xl1WdXcf/D7ufey+4ddzkju+llN43cN8n+qpoctm3cJ4slSWNmEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGrdm3AVI0rjVQ4v7uZ966Mp6YuPpYhBIat59l9w37hLGyqEhSWqcQSBJjTMIJKlxBoEkNa7XIEiyOcnhJEeS7Biy/UVJZpPc0r1+pM96JEkP1NtVQ0lWAzuBy4BjwL4kU1V1cEHXN1fVtr7qkCSdWp9HBBcDR6rqaFXdC+wGruzx8yRJD0KfQXA+cMdA+1i3bqFnJ/lQkrckWT/sjZJsTTKdZHp2draPWiWpWeM+Wfw2YENVXQTcALxxWKeq2lVVk1U1OTExsawFStLZrs8gmAEGf8Nf1637vKq6s6r+p2v+AfCNPdYjSRqizyDYB2xMcmGStcBVwNRghyTnDTSvAA71WI8kaYjerhqqqrkk24C9wGrguqo6kOQaYLqqpoCXJ7kCmAPuAl7UVz2SpOF6nXSuqvYAexasu3pg+VXAq/qsQZJ0auM+WSxJGjODQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNa7XIEiyOcnhJEeS7DhFv2cnqSSTfdYjSXqg3oIgyWpgJ3A5sAnYkmTTkH7nAK8A3t9XLZKkk+vziOBi4EhVHa2qe4HdwJVD+r0GeC3w3z3WIkk6iT6D4HzgjoH2sW7d5yV5CrC+qv6mxzokSacwtpPFSVYBrwN+aoS+W5NMJ5menZ3tvzhJakifQTADrB9or+vWnXAO8ATg3UluB74ZmBp2wriqdlXVZFVNTkxM9FiyJLVnzak2JvkMUC
fbXlWPOMXu+4CNSS5kPgCuAp43sO+ngXMHPuvdwCuranqkyiVJp8Upg6CqzgFI8hrgE8CfAgGeD5y3yL5zSbYBe4HVwHVVdSDJNcB0VU2dhvolSV+gUwbBgCuq6hsG2r+X5IPA1afaqar2AHsWrBu6T1U9Y8RaJEmn0ajnCD6b5PlJVidZleT5wGf7LEyStDxGDYLnAd8P/Fv3ei4D4/2SpJVrpKGhqrqd4TeDSZJWuJGOCJI8Nsm7kny4a1+U5NX9liZJWg6jDg39PvAq4D6AqvoQ85eDSpJWuFGD4KFV9YEF6+ZOdzGSpOU3ahB8Ksmj6W4uS/Ic5u8rkCStcKPeR/BSYBfwuCQzwG3M31QmSVrhRg2Cj1XVpUkeBqyqqs/0WZQkafmMOjR0W5JdzE8Md0+P9UiSltmoQfA44G+ZHyK6LcnvJPnW/sqSJC2XkYKgqv6zqq6vqu8Dngw8AnhPr5VJkpbFyM8jSPL0JL8L7AcewvyUE5KkFW6kk8Xdg2P+Gbge+OmqcsI5STpLjHrV0EVV9R+9ViJJGovFnlC2vaquBX45yQOeVFZVL++tMknSsljsiOBQ96ePj5Sks9Rij6p8W7d4a1XdvAz1SJKW2ahXDf16kkNJXpPkCb1WJElaVqPeR/BM4JnALPCGJLf6PAJJOjuMfB9BVR2vqt8CXgLcwiIPrpckrQyjPqHs8Ul+IcmtwG8D7wXW9VqZJGlZjHofwXXAbuA7qupfe6xHkrTMFg2CJKuB26rqN5ehHknSMlt0aKiqPgesT7J2qW+eZHOSw0mOJNkxZPtLuhPPtyT5xySblvoZ0plg+/btvOAFL2D79u3jLkVaslGHhm4DbkoyBXx+nqGqet3JduiOJHYClwHHgH1Jpqrq4EC3N1XV67v+VwCvAzYv7UeQxu/48ePMzMyMuwzpQRk1CP6le60Czhlxn4uBI1V1FCDJbuBK4PNBsGD+oofRPRNZkrR8RgqCqvrFB/He5wN3DLSPAd+0sFOSlwI/CawFvn3YGyXZCmwFuOCCCx5EKZKkkxl1GuobGfLbelUN/Yd7KapqJ7AzyfOAVwMvHNJnF7ALYHJy0qMGSTqNRh0aeuXA8kOAZwNzi+wzA6wfaK/r1p3MbuD3RqxHknSajDo0tH/BqpuSfGCR3fYBG5NcyHwAXAU8b7BDko1V9dGu+V3AR5EkLatRh4a+fKC5CpgEHnmqfapqLsk2YC+wGriuqg4kuQaYrqopYFuSS4H7gLsZMiwkSerXqEND+/m/cwRzwO3Aixfbqar2AHsWrLt6YPkVI36+JKkniz2h7KnAHVV1Ydd+IfPnB25n4DJQSdLKtdidxW8A7gVI8jTgV4A3Ap+mu4pHkrSyLTY0tLqq7uqWfwDYVVVvBd6a5JZeK5MkLYtFgyDJmqqaA55Fd1PXiPtKX5CPX/PEcZcwsrm7vhxYw9xdH1tRdV9w9a3jLkFngMX+Mf9z4D1JPgX8F/APAEkew/zwkCRphVvs4fW/nORdwHnAO6vqxJVDq4CX9V2cJKl/iw7vVNX7hqz7SD/lSJKW28jPLJYknZ0MAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjnDhOOg3Ofcj9wFz3p7SyGATSafDKi/593CVID5pDQ5LUOINAkhpnEEhS4wwCSWqcQSBJjes1CJJsTnI4yZEkO4Zs/8kkB5N8KMm7knxNn/VIkh6otyBIshrYCVwObAK2JNm0oNs/A5NVdRHwFuDavuqRJA3X5xHBxcCRqjpaVfcCu4ErBztU1Y1V9Z9d833Auh7rkSQN0WcQnA/cMdA+1q07mRcD7xi2IcnWJNNJpmdnZ09jiZKkM+JkcZIfBCaBXx22vap2VdVkVU1OTEwsb3GSdJbrc4qJGWD9QHtdt+7/SXIp8LPA06vqf3qsR5I0RJ9HBPuAjUkuTLIWuAqYGuyQ5MnAG4ArquqTPdYiSTqJ3oKgquaAbcBe4BBwfVUdSH
JNkiu6br8KPBz4iyS3JJk6ydtJknrS6+yjVbUH2LNg3dUDy5f2+fmSpMWdESeLJUnjYxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTG9RoESTYnOZzkSJIdQ7Y/LcnNSeaSPKfPWiRJw/UWBElWAzuBy4FNwJYkmxZ0+zjwIuBNfdUhSTq1NT2+98XAkao6CpBkN3AlcPBEh6q6vdt2f491SJJOoc+hofOBOwbax7p1S5Zka5LpJNOzs7OnpThJ0rwVcbK4qnZV1WRVTU5MTIy7HEk6q/QZBDPA+oH2um6dJOkM0mcQ7AM2JrkwyVrgKmCqx8+TJD0IvQVBVc0B24C9wCHg+qo6kOSaJFcAJHlqkmPAc4E3JDnQVz2SpOH6vGqIqtoD7Fmw7uqB5X3MDxlJksZkRZwsliT1xyCQpMYZBJLUOINAkhpnEEhS4wwCSWqcQSBJjTMIJKlxBoEkNc4gkKTGGQSS1DiDQJIaZxBIUuMMAklqnEEgSY0zCCSpcQaBJDXOIJCkxhkEktQ4g0CSGmcQSFLjDAJJapxBIEmNMwgkqXG9BkGSzUkOJzmSZMeQ7V+c5M3d9vcn2dBnPZKkB+otCJKsBnYClwObgC1JNi3o9mLg7qp6DPAbwGv7qkeSNFyfRwQXA0eq6mhV3QvsBq5c0OdK4I3d8luAZyVJjzVJkhZY0+N7nw/cMdA+BnzTyfpU1VySTwNfAXxqsFOSrcDWrnlPksO9VHxmOJcFP79WjJX33f28v3cNWHHfX16+pO/va062oc8gOG2qahewa9x1LIck01U1Oe46tHR+dytby99fn0NDM8D6gfa6bt3QPknWAI8E7uyxJknSAn0GwT5gY5ILk6wFrgKmFvSZAl7YLT8H+Luqqh5rkiQt0NvQUDfmvw3YC6wGrquqA0muAaaragr4Q+BPkxwB7mI+LFrXxBDYWcrvbmVr9vuLv4BLUtu8s1iSGmcQSFLjDIIzxGLTcejMleS6JJ9M8uFx16KlS7I+yY1JDiY5kOQV465puXmO4AzQTcfxEeAy5m+82wdsqaqDYy1MI0nyNOAe4E+q6gnjrkdLk+Q84LyqujnJOcB+4Hta+v/PI4IzwyjTcegMVVV/z/xVb1qBquoTVXVzt/wZ4BDzsx40wyA4MwybjqOpv4jSmaCbAfnJwPvHXMqyMggkCUjycOCtwI9X1X+Mu57lZBCcGUaZjkNST5J8EfMh8GdV9Zfjrme5GQRnhlGm45DUg27q+z8EDlXV68ZdzzgYBGeAqpoDTkzHcQi4vqoOjLcqjSrJnwP/BHxdkmNJXjzumrQklwA/BHx7klu613eOu6jl5OWjktQ4jwgkqXEGgSQ1ziCQpMYZBJLUOINAkhpnEOislOQrBi4FPJ5kplu+J8nvLrLvPUv4nGck+ZYF634wyYe6mSw/mOQPknzpg/xRpN719qhKaZyq6k7gSQBJfgG4p6p+rYePegbzM4++t/uszcBPAJdX1Uw3s+wLga8C/n1wxySrq+pzPdQkLYlHBGpK9xv827vlhyf5oyS3dr/BP3tB33OT/FOS70oykeStSfZ1r0u6CcpeAvxEd7TxbcDPAq+sqhmAqvpcVV1XVYe797w9yWuT3Aw8N8mW7vM/nOS1A599z8Dyc5L8cbf8x0len2Q6yUeSfHev/8HUBI8I1LKfAz5dVU8ESPJlJzYk+Srmp/l4dVXdkORNwG9U1T8muQDYW1WPT/J6Bo42knw9cPMin3tnVT0lyVcD7wO+EbgbeGeS76mqv1pk/w3MT13+aODGJI+pqv9e2o8u/R+PCNSyS4GdJxpVdXe3+EXAu4DtVXXDQN/fSXIL8wHxiG62ypNK8sTuSOFfkvzAwKY3d38+FXh3Vc1204z8GfC0Eeq+vqrur6qPAk
eBx42wj3RSBoH0QHPMP6XqOwbWrQK+uaqe1L3Or6phJ5UPAE8BqKpbq+pJwDuALxno89kRahic++Uhp9g2rC0tiUGglt0AvPREY2BoqIAfBh6X5Ge6de8EXjbQ90nd4meAcwbe81eAX0uybmDdYAgM+gDw9O5cxGpgC/Cebtu/JXl8klXA9y7Y77lJViV5NPC1wOFFf1LpFAwCteyXgC/rTtR+EHjmiQ3d1TxbmJ+R8seAlwOT3Unlg8yfJAZ4G/C9J04WV9Ue4LeAd3QPQ38v8DnmZ5b9f6rqE8AO4Ebgg8D+qvrrbvMO4O3MX430iQW7fpz5EHkH8BLPD+gL5eyj0grSXT309qp6y7hr0dnDIwJJapxHBJLUOI8IJKlxBoEkNc4gkKTGGQSS1DiDQJIa97/P2tIX6QqKggAAAABJRU5ErkJggg==\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
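],
"source": [
"def Ticket_Label(s):\n",
"    # 2 to 4 passengers share the ticket (the group with the higher survival rate)\n",
"    if 2 <= s <= 4:\n",
"        return 2\n",
"    # a passenger alone on the ticket, or 5 to 8 passengers sharing it\n",
"    elif s == 1 or 4 < s <= 8:\n",
"        return 1\n",
"    # more than 8 passengers share the ticket\n",
"    elif s > 8:\n",
"        return 0\n",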
"\n",
"all_data['TicketGroup'] = all_data['TicketGroup'].apply(Ticket_Label)\n",
"sns.barplot(x='TicketGroup', y='Survived', data=all_data)"
]
},
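{
"cell_type": "markdown",
"metadata": {},
"source": [
"3. Data cleaning"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1) Filling missing values"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Age feature: Age has 263 missing values, which is a large share, so a random forest model built on the three features Sex, Title and Pclass is used to fill in the missing ages."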
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.ensemble import RandomForestRegressor"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"age_df = all_data[['Age', 'Pclass','Sex','Title']]\n",
"age_df=pd.get_dummies(age_df)\n",
"known_age = age_df[age_df.Age.notnull()].values\n",
"unknown_age = age_df[age_df.Age.isnull()].values\n",
"y = known_age[:, 0]\n",
"X = known_age[:, 1:]\n",
"rfr = RandomForestRegressor(random_state=0, n_estimators=100, n_jobs=-1)\n",
"rfr.fit(X, y)\n",
"predictedAges = rfr.predict(unknown_age[:, 1::])\n",
"all_data.loc[ (all_data.Age.isnull()), 'Age' ] = predictedAges "
]
},
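{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the same imputation idea on made-up data, assuming scikit-learn is installed. The `toy` frame, its values and the `feature_cols` helper are illustrative assumptions and not part of the Titanic data; only the pattern (fit on the rows with a known age, predict the rows without one) mirrors the cell above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"from sklearn.ensemble import RandomForestRegressor\n",
"\n",
"# Hypothetical toy data: Age is missing for two rows\n",
"toy = pd.DataFrame({\n",
"    'Age': [22.0, 38.0, np.nan, 35.0, np.nan, 4.0],\n",
"    'Pclass': [3, 1, 3, 1, 2, 3],\n",
"    'Sex': ['male', 'female', 'male', 'female', 'male', 'female'],\n",
"})\n",
"# One-hot encode the categorical column as floats; Age and Pclass stay numeric\n",
"encoded = pd.get_dummies(toy, dtype=float)\n",
"feature_cols = [c for c in encoded.columns if c != 'Age']\n",
"known = encoded[encoded['Age'].notnull()]\n",
"unknown = encoded[encoded['Age'].isnull()]\n",
"# Fit on rows with a known age, then predict the missing ones\n",
"rfr_toy = RandomForestRegressor(n_estimators=10, random_state=0)\n",
"rfr_toy.fit(known[feature_cols], known['Age'])\n",
"toy.loc[toy['Age'].isnull(), 'Age'] = rfr_toy.predict(unknown[feature_cols])\n",
"toy"
]
},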
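{
"cell_type": "markdown",
"metadata": {},
"source": [
"Embarked feature: Embarked has 2 missing values. Both passengers with a missing Embarked have Pclass 1 and Fare 80. Among Pclass 1 passengers, the median Fare for Embarked C (76.73) is the closest to 80 (Q: 90.00, S: 52.00), so the missing values are filled with C."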
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>PassengerId</th>\n",
" <th>Survived</th>\n",
" <th>Pclass</th>\n",
" <th>Name</th>\n",
" <th>Sex</th>\n",
" <th>Age</th>\n",
" <th>SibSp</th>\n",
" <th>Parch</th>\n",
" <th>Ticket</th>\n",
" <th>Fare</th>\n",
" <th>Cabin</th>\n",
" <th>Embarked</th>\n",
" <th>Title</th>\n",
" <th>FamilySize</th>\n",
" <th>FamilyLabel</th>\n",
" <th>Deck</th>\n",
" <th>TicketGroup</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>61</th>\n",
" <td>62</td>\n",
" <td>1.0</td>\n",
" <td>1</td>\n",
" <td>Icard, Miss. Amelie</td>\n",
" <td>female</td>\n",
" <td>38.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>113572</td>\n",
" <td>80.0</td>\n",
" <td>B28</td>\n",
" <td>NaN</td>\n",
" <td>Miss</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>B</td>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>829</th>\n",
" <td>830</td>\n",
" <td>1.0</td>\n",
" <td>1</td>\n",
" <td>Stone, Mrs. George Nelson (Martha Evelyn)</td>\n",
" <td>female</td>\n",
" <td>62.0</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>113572</td>\n",
" <td>80.0</td>\n",
" <td>B28</td>\n",
" <td>NaN</td>\n",
" <td>Mrs</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>B</td>\n",
" <td>2</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Survived Pclass Name \\\n",
"61 62 1.0 1 Icard, Miss. Amelie \n",
"829 830 1.0 1 Stone, Mrs. George Nelson (Martha Evelyn) \n",
"\n",
" Sex Age SibSp Parch Ticket Fare Cabin Embarked Title \\\n",
"61 female 38.0 0 0 113572 80.0 B28 NaN Miss \n",
"829 female 62.0 0 0 113572 80.0 B28 NaN Mrs \n",
"\n",
" FamilySize FamilyLabel Deck TicketGroup \n",
"61 1 1 B 2 \n",
"829 1 1 B 2 "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"all_data[all_data['Embarked'].isnull()]"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Pclass Embarked\n",
"1 C 76.7292\n",
" Q 90.0000\n",
" S 52.0000\n",
"2 C 15.3146\n",
" Q 12.3500\n",
" S 15.3750\n",
"3 C 7.8958\n",
" Q 7.7500\n",
" S 8.0500\n",
"Name: Fare, dtype: float64"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"all_data.groupby(by=[\"Pclass\",\"Embarked\"]).Fare.median()"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"all_data['Embarked'] = all_data['Embarked'].fillna('C')"
]
},
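{
"cell_type": "markdown",
"metadata": {},
"source": [
"Fare feature: Fare has 1 missing value. The passenger with the missing Fare has Embarked S and Pclass 3, so the value is filled with the median Fare of passengers with Embarked S and Pclass 3."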
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>PassengerId</th>\n",
" <th>Survived</th>\n",
" <th>Pclass</th>\n",
" <th>Name</th>\n",
" <th>Sex</th>\n",
" <th>Age</th>\n",
" <th>SibSp</th>\n",
" <th>Parch</th>\n",
" <th>Ticket</th>\n",
" <th>Fare</th>\n",
" <th>Cabin</th>\n",
" <th>Embarked</th>\n",
" <th>Title</th>\n",
" <th>FamilySize</th>\n",
" <th>FamilyLabel</th>\n",
" <th>Deck</th>\n",
" <th>TicketGroup</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1043</th>\n",
" <td>1044</td>\n",
" <td>NaN</td>\n",
" <td>3</td>\n",
" <td>Storey, Mr. Thomas</td>\n",
" <td>male</td>\n",
" <td>60.5</td>\n",
" <td>0</td>\n",
" <td>0</td>\n",
" <td>3701</td>\n",
" <td>NaN</td>\n",
" <td>Unknown</td>\n",
" <td>S</td>\n",
" <td>Mr</td>\n",
" <td>1</td>\n",
" <td>1</td>\n",
" <td>U</td>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" PassengerId Survived Pclass Name Sex Age SibSp \\\n",
"1043 1044 NaN 3 Storey, Mr. Thomas male 60.5 0 \n",
"\n",
" Parch Ticket Fare Cabin Embarked Title FamilySize FamilyLabel \\\n",
"1043 0 3701 NaN Unknown S Mr 1 1 \n",
"\n",
" Deck TicketGroup \n",
"1043 U 1 "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"all_data[all_data['Fare'].isnull()]"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"fare=all_data[(all_data['Embarked'] == \"S\") & (all_data['Pclass'] == 3)].Fare.median()\n",
"all_data['Fare']=all_data['Fare'].fillna(fare)"
]
},
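{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same kind of fill can be written so that every missing fare receives the median of its own (Embarked, Pclass) group, which also covers the case of more than one missing value. This is only a sketch on made-up numbers: the `toy` frame and the `group_median` name are assumptions for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data with one missing fare in the (S, 3) group\n",
"toy = pd.DataFrame({\n",
"    'Embarked': ['S', 'S', 'S', 'C', 'C'],\n",
"    'Pclass': [3, 3, 3, 1, 1],\n",
"    'Fare': [8.0, np.nan, 10.0, 80.0, 70.0],\n",
"})\n",
"# transform('median') returns one value per row: the median of that row's group (NaN is skipped)\n",
"group_median = toy.groupby(['Embarked', 'Pclass'])['Fare'].transform('median')\n",
"# Only the missing entries are replaced; known fares are left untouched\n",
"toy['Fare'] = toy['Fare'].fillna(group_median)\n",
"toy"
]
},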
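{
"cell_type": "markdown",
"metadata": {},
"source": [
"2) Group identification"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Passengers with the same surname are assigned to the same group. From the groups with more than one member, the women and children and the adult men of each group are extracted separately."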
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"all_data['Surname']=all_data['Name'].apply(lambda x:x.split(',')[0].strip())\n",
"Surname_Count = dict(all_data['Surname'].value_counts())\n",
"all_data['FamilyGroup'] = all_data['Surname'].apply(lambda x:Surname_Count[x])\n",
"Female_Child_Group=all_data.loc[(all_data['FamilyGroup']>=2) & ((all_data['Age']<=12) | (all_data['Sex']=='female'))]\n",
"Male_Adult_Group=all_data.loc[(all_data['FamilyGroup']>=2) & (all_data['Age']>12) & (all_data['Sex']=='male')]"
]
},
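{
"cell_type": "markdown",
"metadata": {},
"source": [
"The vast majority of women-and-children groups have an average survival rate of exactly 1 or 0; that is, the women and children in the same group either all survived or all perished."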
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GroupCount</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1.000000</th>\n",
" <td>115</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.000000</th>\n",
" <td>31</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.750000</th>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.333333</th>\n",
" <td>1</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.142857</th>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GroupCount\n",
"1.000000 115\n",
"0.000000 31\n",
"0.750000 2\n",
"0.333333 1\n",
"0.142857 1"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Female_Child=pd.DataFrame(Female_Child_Group.groupby('Surname')['Survived'].mean().value_counts())\n",
"Female_Child.columns=['GroupCount']\n",
"Female_Child"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Text(0.5, 0, 'AverageSurvived')"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
},
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 432x288 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"sns.barplot(x=Female_Child.index, y=Female_Child[\"GroupCount\"]).set_xlabel('AverageSurvived')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The vast majority of adult-male groups also have an average survival rate of 1 or 0."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>GroupCount</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0.000000</th>\n",
" <td>122</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1.000000</th>\n",
" <td>20</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.500000</th>\n",
" <td>6</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.333333</th>\n",
" <td>2</td>\n",
" </tr>\n",
" <tr>\n",
" <th>0.250000</th>\n",
" <td>1</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" GroupCount\n",
"0.000000 122\n",
"1.000000 20\n",
"0.500000 6\n",
"0.333333 2\n",
"0.250000 1"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Male_Adult=pd.DataFrame(Male_Adult_Group.groupby('Surname')['Survived'].mean().value_counts())\n",
"Male_Adult.columns=['GroupCount']\n",
"Male_Adult"
]
},
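{
"cell_type": "markdown",
"metadata": {},
"source": [
"Because the general pattern is that women and children have a high survival rate while adult men have a low one, we pick out the anomalous groups that break this pattern and handle them separately. Women-and-children groups with a survival rate of 0 are marked as perished groups, and adult-male groups with a survival rate of 1 are marked as survived groups. The assumption is that women and children in a perished group are less likely to have survived, and adult men in a survived group are more likely to have survived."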
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'Strom', 'Cacic', 'Boulos', 'Ford', 'Johnston', 'Robins', 'Arnold-Franchi', 'Vander Planke', 'Bourke', 'Ilmakangas', 'Turpin', 'Van Impe', 'Barbara', 'Skoog', 'Caram', 'Jussila', 'Rosblom', 'Palsson', 'Goodwin', 'Lefebre', 'Sage', 'Lobb', 'Danbom', 'Olsson', 'Rice', 'Lahtinen', 'Attalah', 'Canavan', 'Oreskovic', 'Zabour', 'Panula'}\n",
"{'Goldenberg', 'Harder', 'Taylor', 'Bishop', 'Dick', 'Frauenthal', 'Greenfield', 'Daly', 'Beane', 'Bradley', 'Nakid', 'Jussila', 'Duff Gordon', 'Frolicher-Stehli', 'Beckwith', 'Kimball', 'Chambers', 'Jonsson', 'Cardeza', 'McCoy'}\n"
]
}
],
"source": [
"Female_Child_Group=Female_Child_Group.groupby('Surname')['Survived'].mean()\n",
"Dead_List=set(Female_Child_Group[Female_Child_Group.apply(lambda x:x==0)].index)\n",
"print(Dead_List)\n",
"Male_Adult_List=Male_Adult_Group.groupby('Surname')['Survived'].mean()\n",
"Survived_List=set(Male_Adult_List[Male_Adult_List.apply(lambda x:x==1)].index)\n",
"print(Survived_List)"
]
},
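{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the selection of anomalous groups concrete, here is a small sketch on invented passengers. The names, the `toy` frame and the `dead_groups` variable are hypothetical; the logic (group by surname, average Survived, keep the groups whose mean is exactly 0) follows the cell above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical toy data: two surname groups of women and girls\n",
"toy = pd.DataFrame({\n",
"    'Name': ['Smith, Mrs. Anna', 'Smith, Miss. Eva', 'Jones, Mrs. Mary', 'Jones, Miss. Kate'],\n",
"    'Survived': [0.0, 0.0, 1.0, 0.0],\n",
"})\n",
"# The surname is the part of Name before the first comma\n",
"toy['Surname'] = toy['Name'].str.split(',').str[0].str.strip()\n",
"# Mean of Survived per surname: 0 means everyone in the group perished\n",
"group_rate = toy.groupby('Surname')['Survived'].mean()\n",
"dead_groups = set(group_rate[group_rate == 0].index)\n",
"print(dead_groups)"
]
},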
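{
"cell_type": "markdown",
"metadata": {},
"source": [
"So that samples in these two kinds of anomalous groups are classified correctly, the Age, Title and Sex of the test-set samples in those groups are overwritten with penalizing values."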
]
},
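{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"train=all_data.loc[all_data['Survived'].notnull()]\n",
"# copy() makes test an independent frame, so the assignments below do not raise SettingWithCopyWarning\n",
"test=all_data.loc[all_data['Survived'].isnull()].copy()\n",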
"test.loc[(test['Surname'].apply(lambda x:x in Dead_List)),'Sex'] = 'male'\n",
"test.loc[(test['Surname'].apply(lambda x:x in Dead_List)),'Age'] = 60\n",
"test.loc[(test['Surname'].apply(lambda x:x in Dead_List)),'Title'] = 'Mr'\n",
"test.loc[(test['Surname'].apply(lambda x:x in Survived_List)),'Sex'] = 'female'\n",
"test.loc[(test['Surname'].apply(lambda x:x in Survived_List)),'Age'] = 5\n",
"test.loc[(test['Surname'].apply(lambda x:x in Survived_List)),'Title'] = 'Miss'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3) Feature transformation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Select the features, convert them to numeric variables, and split the data into training and test sets."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"all_data=pd.concat([train, test])\n",
"all_data=all_data[['Survived','Pclass','Sex','Age','Fare','Embarked','Title','FamilyLabel','Deck','TicketGroup']]\n",
"all_data=pd.get_dummies(all_data)\n",
"train=all_data[all_data['Survived'].notnull()]\n",
"test=all_data[all_data['Survived'].isnull()].drop('Survived',axis=1)\n",
"X = train.values[:,1:]\n",
"y = train.values[:,0]"
]
},
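{
"cell_type": "markdown",
"metadata": {},
"source": [
"4. Modeling and optimization"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"1) Parameter tuning"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Grid search is used to select the best parameters automatically. The best parameters I obtained from grid search were n_estimators = 28, max_depth = 6 (the output recorded in this notebook reports n_estimators = 48, max_depth = 6). However, following another Kernel and changing the parameters to n_estimators = 26, max_depth = 6 slightly improved both the cross-validation score and the Kaggle score."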
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import Pipeline\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.model_selection import GridSearchCV\n",
"from sklearn.feature_selection import SelectKBest"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'classify__max_depth': 6, 'classify__n_estimators': 48} 0.8793119995472937\n"
]
}
],
"source": [
"# Pipeline: keep the 20 best-scoring features, then fit a random forest.\n",
"pipe=Pipeline([('select',SelectKBest(k=20)),\n",
"               ('classify', RandomForestClassifier(random_state = 10, max_features = 'sqrt'))])\n",
"\n",
"# Grid over forest size and tree depth; the classify__ prefix routes each parameter to the forest step.\n",
"param_test = {'classify__n_estimators':list(range(20,50,2)),\n",
"              'classify__max_depth':list(range(3,60,3))}\n",
"# 10-fold cross-validation, scored by ROC AUC.\n",
"gsearch = GridSearchCV(estimator = pipe, param_grid = param_test, scoring='roc_auc', cv=10)\n",
"gsearch.fit(X,y)\n",
"print(gsearch.best_params_, gsearch.best_score_)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2)训练模型"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import make_pipeline"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Pipeline(steps=[('selectkbest', SelectKBest(k=20)),\n",
"                ('randomforestclassifier',\n",
"                 RandomForestClassifier(max_depth=6, max_features='sqrt',\n",
"                                        n_estimators=26, random_state=10,\n",
"                                        warm_start=True))])"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# n_estimators = 26 and max_depth = 6 are the hand-adjusted values described under parameter tuning.\n",
"select = SelectKBest(k = 20)\n",
"clf = RandomForestClassifier(random_state = 10, warm_start = True,\n",
"                             n_estimators = 26,\n",
"                             max_depth = 6,\n",
"                             max_features = 'sqrt')\n",
"pipeline = make_pipeline(select, clf)\n",
"pipeline.fit(X, y)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3) Cross-validation"
]
},
{
"cell_type": "code",
"execution_count": 49,
"metadata": {},
"outputs": [],
"source": [
"from sklearn import metrics\n",
"from sklearn import model_selection"
]
},
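{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CV Score : Mean - 0.8462422 | Std - 0.03623982 \n"
]
}
],
"source": [
"# No scoring argument is given, so the classifier's default scorer (accuracy) is used;\n",
"# this mean is therefore not directly comparable to the ROC AUC reported by the grid search.\n",
"cv_score = model_selection.cross_val_score(pipeline, X, y, cv= 10)\n",
"print(\"CV Score : Mean - %.7g | Std - %.7g \" % (np.mean(cv_score), np.std(cv_score)))"
]
},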
{
"cell_type": "markdown",
"metadata": {},
"source": [
"5. Prediction"
]
},
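{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"# Predict on the test features as a float array, matching the array format used for fitting.\n",
"predictions = pipeline.predict(test.values.astype(float))\n",
"# Build the submission file with the original test PassengerId values and integer labels.\n",
"submission = pd.DataFrame({\"PassengerId\": PassengerId, \"Survived\": predictions.astype(np.int32)})\n",
"submission.to_csv(r\"submission1.csv\", index=False)"
]
},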
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}