{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "ojlhGzdxhkwR"
|
||
},
|
||
"source": [
|
"# Pandas. Loading libraries"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {
|
||
"id": "xChor81V6mtD"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
"## Description and loading of the library"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "Z22T_R766hsO"
|
||
},
|
||
"source": [
|
" - <a href=\"http://pandas.pydata.org/\">Pandas</a> is a library for data processing and analysis. It is designed for data of different kinds: matrix data, panel data, and time series. It aims to be the most powerful and flexible open-source tool for data analysis."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Requirement already satisfied: pandas in ./venv/lib/python3.13/site-packages (2.2.3)\n",
|
||
"Requirement already satisfied: numpy>=1.26.0 in ./venv/lib/python3.13/site-packages (from pandas) (2.2.3)\n",
|
||
"Requirement already satisfied: python-dateutil>=2.8.2 in ./venv/lib/python3.13/site-packages (from pandas) (2.9.0.post0)\n",
|
||
"Requirement already satisfied: pytz>=2020.1 in ./venv/lib/python3.13/site-packages (from pandas) (2025.1)\n",
|
||
"Requirement already satisfied: tzdata>=2022.7 in ./venv/lib/python3.13/site-packages (from pandas) (2025.1)\n",
|
||
"Requirement already satisfied: six>=1.5 in ./venv/lib/python3.13/site-packages (from python-dateutil>=2.8.2->pandas) (1.17.0)\n",
|
||
"\n",
|
||
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.0.1\u001b[0m\n",
|
||
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"!pip install pandas"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {
|
||
"id": "DqYWosnHhkwU"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
"import pandas as pd # Import the pandas module"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "ZNKpaic1hkwb"
|
||
},
|
||
"source": [
|
"pandas has two main data structures:\n",
"- Series: a one-dimensional array with a labeled index (usually holding data of a single type)\n",
"- DataFrame: a two-dimensional, tabular structure that is easy to resize and can hold columns of different types\n",
"\n",
"Both can be created manually with constructors from the library itself:\n",
"- pandas.Series(data=None, index=None, dtype=None)\n",
"- pandas.DataFrame(data=None, index=None, columns=None, dtype=None)\n",
"\n",
"- **data** - the data to store in the structure\n",
"- **index** - the row labels\n",
"- **columns** - the column names\n",
"- **dtype** - the data type\n",
"\n",
"All parameters other than data are optional\n"
|
||
]
|
||
},
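{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of both constructors described above. The labels and values (`ph_values`, `samples`, `sample_a` and so on) are made up for illustration and are not part of the dataset loaded later in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical values, used only to illustrate the constructors\n",
"ph_values = pd.Series(data=[7.1, 6.8, 8.3], index=['sample_a', 'sample_b', 'sample_c'], dtype='float64')\n",
"\n",
"# A DataFrame built from a dict: the keys become column names\n",
"samples = pd.DataFrame(\n",
"    data={'ph': [7.1, 6.8, 8.3], 'potable': [1, 0, 1]},\n",
"    index=['sample_a', 'sample_b', 'sample_c'],\n",
")\n",
"\n",
"print(ph_values.loc['sample_b'])  # access a Series element by its label\n",
"print(samples.dtypes)             # each column keeps its own dtype"
]
},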
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "tMHOWBBWhkwf"
|
||
},
|
||
"source": [
|
"Of course, we can build DataFrames ourselves!\n",
"\n",
"For example, someone found a fragment of data and asks us to reproduce this dataset:\n",
"\n",
"<img src=\"https://i.imgur.com/FUCGiKP.png\">\n",
"\n",
"Let's work out what is what here and write it down in a structure we already know: lists."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {
|
||
"id": "9yW-A-fRhkwi"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
"columns = ['country', 'province', 'region_1', 'region_2'] # List of column names\n",
"index = [0, 1, 10, 100] # List of row index labels\n",
"\n",
"# List with the data; each table row is a separate list\n",
"data = [['Italy', 'Sicily & Sardinia', 'Etna', 'NaN'],\n",
"        ['Portugal', 'Douro', 'NaN', 'NaN'],\n",
"        ['US', 'California', 'Napa Valley', 'Napa'],\n",
"        ['US', 'New York', 'Finger Lakes', 'Finger Lakes']]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "6jUo7y0uhkwo"
|
||
},
|
||
"source": [
|
"Now let's assemble it into a DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 173
|
||
},
|
||
"id": "jMEdfOOdhkwp",
|
||
"outputId": "b5fae3e6-3e8d-4297-d468-0be74894b070",
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>country</th>\n",
|
||
" <th>province</th>\n",
|
||
" <th>region_1</th>\n",
|
||
" <th>region_2</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>0</th>\n",
|
||
" <td>Italy</td>\n",
|
||
" <td>Sicily & Sardinia</td>\n",
|
||
" <td>Etna</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>1</th>\n",
|
||
" <td>Portugal</td>\n",
|
||
" <td>Douro</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10</th>\n",
|
||
" <td>US</td>\n",
|
||
" <td>California</td>\n",
|
||
" <td>Napa Valley</td>\n",
|
||
" <td>Napa</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>100</th>\n",
|
||
" <td>US</td>\n",
|
||
" <td>New York</td>\n",
|
||
" <td>Finger Lakes</td>\n",
|
||
" <td>Finger Lakes</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" country province region_1 region_2\n",
|
||
"0 Italy Sicily & Sardinia Etna NaN\n",
|
||
"1 Portugal Douro NaN NaN\n",
|
||
"10 US California Napa Valley Napa\n",
|
||
"100 US New York Finger Lakes Finger Lakes"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"df = pd.DataFrame(data, columns=columns, index=index) # Create the DataFrame, passing the data, the column names, and the index\n",
"df # Display the DataFrame (in a notebook this renders better than print())"
|
||
]
|
||
},
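{
"cell_type": "markdown",
"metadata": {},
"source": [
"One caveat about the `data` list used to build this DataFrame: the gaps are written as the string `'NaN'`, which pandas treats as ordinary text rather than as a missing value. A small sketch, assuming we would rather have real missing values, replaces that string with `float('nan')`. The names `raw` and `fixed` are introduced here only for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"raw = pd.DataFrame({'region_2': ['NaN', 'Napa']})\n",
"print(raw['region_2'].isna().sum())    # the string 'NaN' is not counted as missing\n",
"\n",
"fixed = raw.replace('NaN', float('nan'))\n",
"print(fixed['region_2'].isna().sum())  # now the gap is a real missing value"
]
},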
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {
|
||
"id": "TIJhU5vEhkwv"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
"## Loading and writing data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "CjIlX-Ar6vd7"
|
||
},
|
||
"source": [
|
"\n",
"- Functions such as **pd.read_csv** and DataFrame methods such as **df.to_csv**\n",
"read and write data, respectively. <br /> The full list is in the documentation:\n",
"https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html\n",
"\n",
"Let's learn to read data in CSV (comma-separated values) format with the function:\n",
"\n",
"- <a href=\"http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html#pandas.read_csv\"> pd.read_csv()</a>: \n",
"\n",
"It has a great many arguments; the most important ones are:\n",
" - **filepath_or_buffer** - a string with the file name (path or URL)\n",
" - **sep** - the delimiter between values\n",
" - **header** - the number of the row in the file that holds the column names, or None if there is no such row\n",
" - **names** - a list of column names\n",
" - **index_col** - a column number, a list of them, or nothing: the column(s) to take the row labels from"
|
||
]
|
||
},
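{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see how these arguments interact, here is a small sketch that reads CSV text from memory instead of from a file. The semicolon-separated text and the column names `id`, `ph`, `hardness` are invented for the example; only `pd.read_csv` and its parameters come from the list above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import io\n",
"import pandas as pd\n",
"\n",
"# Invented CSV text: no header row, values separated by semicolons\n",
"csv_text = '10;7.1;204.9\\n11;6.8;129.4\\n'\n",
"\n",
"demo = pd.read_csv(\n",
"    io.StringIO(csv_text),  # a file-like object works in place of a path\n",
"    sep=';',\n",
"    header=None,            # the text has no header row\n",
"    names=['id', 'ph', 'hardness'],\n",
"    index_col=0,            # use the first column (id) as the row labels\n",
")\n",
"demo"
]
},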
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {
|
||
"id": "mWdKBTMNhkwx"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
"data = pd.read_csv('water_potability.csv') # Use read_csv to load the file water_potability.csv and store the result in data"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "g8zunGkmhkw2"
|
||
},
|
||
"source": [
|
"**Let's see what was loaded**\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "slhGLHJNhkw4",
|
||
"outputId": "58af12df-d33f-4a2a-e3f6-5a763ba68831"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>ph</th>\n",
|
||
" <th>Hardness</th>\n",
|
||
" <th>Solids</th>\n",
|
||
" <th>Chloramines</th>\n",
|
||
" <th>Sulfate</th>\n",
|
||
" <th>Conductivity</th>\n",
|
||
" <th>Organic carbon</th>\n",
|
||
" <th>Trihalomethanes</th>\n",
|
||
" <th>Turbidity</th>\n",
|
||
" <th>Potability</th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>3271</th>\n",
|
||
" <td>4.668102</td>\n",
|
||
" <td>193.681735</td>\n",
|
||
" <td>47580.991603</td>\n",
|
||
" <td>7.166639</td>\n",
|
||
" <td>359.948574</td>\n",
|
||
" <td>526.424171</td>\n",
|
||
" <td>13.894419</td>\n",
|
||
" <td>66.687695</td>\n",
|
||
" <td>4.435821</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3272</th>\n",
|
||
" <td>7.808856</td>\n",
|
||
" <td>193.553212</td>\n",
|
||
" <td>17329.802160</td>\n",
|
||
" <td>8.061362</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>392.449580</td>\n",
|
||
" <td>19.903225</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>2.798243</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3273</th>\n",
|
||
" <td>9.419510</td>\n",
|
||
" <td>175.762646</td>\n",
|
||
" <td>33155.578218</td>\n",
|
||
" <td>7.350233</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>432.044783</td>\n",
|
||
" <td>11.039070</td>\n",
|
||
" <td>69.845400</td>\n",
|
||
" <td>3.298875</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3274</th>\n",
|
||
" <td>5.126763</td>\n",
|
||
" <td>230.603758</td>\n",
|
||
" <td>11983.869376</td>\n",
|
||
" <td>6.303357</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>402.883113</td>\n",
|
||
" <td>11.168946</td>\n",
|
||
" <td>77.488213</td>\n",
|
||
" <td>4.708658</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3275</th>\n",
|
||
" <td>7.874671</td>\n",
|
||
" <td>195.102299</td>\n",
|
||
" <td>17404.177061</td>\n",
|
||
" <td>7.509306</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>327.459760</td>\n",
|
||
" <td>16.140368</td>\n",
|
||
" <td>78.698446</td>\n",
|
||
" <td>2.309149</td>\n",
|
||
" <td>1</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" ph Hardness Solids Chloramines Sulfate \\\n",
|
||
"3271 4.668102 193.681735 47580.991603 7.166639 359.948574 \n",
|
||
"3272 7.808856 193.553212 17329.802160 8.061362 NaN \n",
|
||
"3273 9.419510 175.762646 33155.578218 7.350233 NaN \n",
|
||
"3274 5.126763 230.603758 11983.869376 6.303357 NaN \n",
|
||
"3275 7.874671 195.102299 17404.177061 7.509306 NaN \n",
|
||
"\n",
|
||
" Conductivity Organic carbon Trihalomethanes Turbidity Potability \n",
|
||
"3271 526.424171 13.894419 66.687695 4.435821 1 \n",
|
||
"3272 392.449580 19.903225 NaN 2.798243 1 \n",
|
||
"3273 432.044783 11.039070 69.845400 3.298875 1 \n",
|
||
"3274 402.883113 11.168946 77.488213 4.708658 1 \n",
|
||
"3275 327.459760 16.140368 78.698446 2.309149 1 "
|
||
]
|
||
},
|
||
"execution_count": 8,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.tail() # Use the tail method to show the last 5 rows of the DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "TAEcUwXohkw9"
|
||
},
|
||
"source": [
|
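"The first column can also serve as the row index; let's reload the file that way. (Here the first column is ph, a measurement rather than an identifier, so this only demonstrates index_col.)"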
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {
|
||
"id": "UQ_ne0wIhkw-"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
"data = pd.read_csv('water_potability.csv', index_col=0) # The index_col parameter specifies the column to use as the DataFrame index"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 924
|
||
},
|
||
"id": "u5iBpJ0jhkxC",
|
||
"outputId": "b8c9ab01-2747-467a-e833-870c9d83d11b",
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Hardness</th>\n",
|
||
" <th>Solids</th>\n",
|
||
" <th>Chloramines</th>\n",
|
||
" <th>Sulfate</th>\n",
|
||
" <th>Conductivity</th>\n",
|
||
" <th>Organic carbon</th>\n",
|
||
" <th>Trihalomethanes</th>\n",
|
||
" <th>Turbidity</th>\n",
|
||
" <th>Potability</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ph</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>NaN</th>\n",
|
||
" <td>204.890455</td>\n",
|
||
" <td>20791.318981</td>\n",
|
||
" <td>7.300212</td>\n",
|
||
" <td>368.516441</td>\n",
|
||
" <td>564.308654</td>\n",
|
||
" <td>10.379783</td>\n",
|
||
" <td>86.990970</td>\n",
|
||
" <td>2.963135</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3.716080</th>\n",
|
||
" <td>129.422921</td>\n",
|
||
" <td>18630.057858</td>\n",
|
||
" <td>6.635246</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>592.885359</td>\n",
|
||
" <td>15.180013</td>\n",
|
||
" <td>56.329076</td>\n",
|
||
" <td>4.500656</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.099124</th>\n",
|
||
" <td>224.236259</td>\n",
|
||
" <td>19909.541732</td>\n",
|
||
" <td>9.275884</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>418.606213</td>\n",
|
||
" <td>16.868637</td>\n",
|
||
" <td>66.420093</td>\n",
|
||
" <td>3.055934</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.316766</th>\n",
|
||
" <td>214.373394</td>\n",
|
||
" <td>22018.417441</td>\n",
|
||
" <td>8.059332</td>\n",
|
||
" <td>356.886136</td>\n",
|
||
" <td>363.266516</td>\n",
|
||
" <td>18.436524</td>\n",
|
||
" <td>100.341674</td>\n",
|
||
" <td>4.628771</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9.092223</th>\n",
|
||
" <td>181.101509</td>\n",
|
||
" <td>17978.986339</td>\n",
|
||
" <td>6.546600</td>\n",
|
||
" <td>310.135738</td>\n",
|
||
" <td>398.410813</td>\n",
|
||
" <td>11.558279</td>\n",
|
||
" <td>31.997993</td>\n",
|
||
" <td>4.075075</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5.584087</th>\n",
|
||
" <td>188.313324</td>\n",
|
||
" <td>28748.687739</td>\n",
|
||
" <td>7.544869</td>\n",
|
||
" <td>326.678363</td>\n",
|
||
" <td>280.467916</td>\n",
|
||
" <td>8.399735</td>\n",
|
||
" <td>54.917862</td>\n",
|
||
" <td>2.559708</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10.223862</th>\n",
|
||
" <td>248.071735</td>\n",
|
||
" <td>28749.716544</td>\n",
|
||
" <td>7.513408</td>\n",
|
||
" <td>393.663396</td>\n",
|
||
" <td>283.651634</td>\n",
|
||
" <td>13.789695</td>\n",
|
||
" <td>84.603556</td>\n",
|
||
" <td>2.672989</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.635849</th>\n",
|
||
" <td>203.361523</td>\n",
|
||
" <td>13672.091764</td>\n",
|
||
" <td>4.563009</td>\n",
|
||
" <td>303.309771</td>\n",
|
||
" <td>474.607645</td>\n",
|
||
" <td>12.363817</td>\n",
|
||
" <td>62.798309</td>\n",
|
||
" <td>4.401425</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>NaN</th>\n",
|
||
" <td>118.988579</td>\n",
|
||
" <td>14285.583854</td>\n",
|
||
" <td>7.804174</td>\n",
|
||
" <td>268.646941</td>\n",
|
||
" <td>389.375566</td>\n",
|
||
" <td>12.706049</td>\n",
|
||
" <td>53.928846</td>\n",
|
||
" <td>3.595017</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11.180284</th>\n",
|
||
" <td>227.231469</td>\n",
|
||
" <td>25484.508491</td>\n",
|
||
" <td>9.077200</td>\n",
|
||
" <td>404.041635</td>\n",
|
||
" <td>563.885481</td>\n",
|
||
" <td>17.927806</td>\n",
|
||
" <td>71.976601</td>\n",
|
||
" <td>4.370562</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7.360640</th>\n",
|
||
" <td>165.520797</td>\n",
|
||
" <td>32452.614409</td>\n",
|
||
" <td>7.550701</td>\n",
|
||
" <td>326.624353</td>\n",
|
||
" <td>425.383419</td>\n",
|
||
" <td>15.586810</td>\n",
|
||
" <td>78.740016</td>\n",
|
||
" <td>3.662292</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7.974522</th>\n",
|
||
" <td>218.693300</td>\n",
|
||
" <td>18767.656682</td>\n",
|
||
" <td>8.110385</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>364.098230</td>\n",
|
||
" <td>14.525746</td>\n",
|
||
" <td>76.485911</td>\n",
|
||
" <td>4.011718</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7.119824</th>\n",
|
||
" <td>156.704993</td>\n",
|
||
" <td>18730.813653</td>\n",
|
||
" <td>3.606036</td>\n",
|
||
" <td>282.344050</td>\n",
|
||
" <td>347.715027</td>\n",
|
||
" <td>15.929536</td>\n",
|
||
" <td>79.500778</td>\n",
|
||
" <td>3.445756</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>NaN</th>\n",
|
||
" <td>150.174923</td>\n",
|
||
" <td>27331.361962</td>\n",
|
||
" <td>6.838223</td>\n",
|
||
" <td>299.415781</td>\n",
|
||
" <td>379.761835</td>\n",
|
||
" <td>19.370807</td>\n",
|
||
" <td>76.509996</td>\n",
|
||
" <td>4.413974</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7.496232</th>\n",
|
||
" <td>205.344982</td>\n",
|
||
" <td>28388.004887</td>\n",
|
||
" <td>5.072558</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>444.645352</td>\n",
|
||
" <td>13.228311</td>\n",
|
||
" <td>70.300213</td>\n",
|
||
" <td>4.777382</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>6.347272</th>\n",
|
||
" <td>186.732881</td>\n",
|
||
" <td>41065.234765</td>\n",
|
||
" <td>9.629596</td>\n",
|
||
" <td>364.487687</td>\n",
|
||
" <td>516.743282</td>\n",
|
||
" <td>11.539781</td>\n",
|
||
" <td>75.071617</td>\n",
|
||
" <td>4.376348</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7.051786</th>\n",
|
||
" <td>211.049406</td>\n",
|
||
" <td>30980.600787</td>\n",
|
||
" <td>10.094796</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>315.141267</td>\n",
|
||
" <td>20.397022</td>\n",
|
||
" <td>56.651604</td>\n",
|
||
" <td>4.268429</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9.181560</th>\n",
|
||
" <td>273.813807</td>\n",
|
||
" <td>24041.326280</td>\n",
|
||
" <td>6.904990</td>\n",
|
||
" <td>398.350517</td>\n",
|
||
" <td>477.974642</td>\n",
|
||
" <td>13.387341</td>\n",
|
||
" <td>71.457362</td>\n",
|
||
" <td>4.503661</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.975464</th>\n",
|
||
" <td>279.357167</td>\n",
|
||
" <td>19460.398131</td>\n",
|
||
" <td>6.204321</td>\n",
|
||
" <td>NaN</td>\n",
|
||
" <td>431.443990</td>\n",
|
||
" <td>12.888759</td>\n",
|
||
" <td>63.821237</td>\n",
|
||
" <td>2.436086</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>7.371050</th>\n",
|
||
" <td>214.496610</td>\n",
|
||
" <td>25630.320037</td>\n",
|
||
" <td>4.432669</td>\n",
|
||
" <td>335.754439</td>\n",
|
||
" <td>469.914551</td>\n",
|
||
" <td>12.509164</td>\n",
|
||
" <td>62.797277</td>\n",
|
||
" <td>2.560299</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Hardness Solids Chloramines Sulfate Conductivity \\\n",
|
||
"ph \n",
|
||
"NaN 204.890455 20791.318981 7.300212 368.516441 564.308654 \n",
|
||
"3.716080 129.422921 18630.057858 6.635246 NaN 592.885359 \n",
|
||
"8.099124 224.236259 19909.541732 9.275884 NaN 418.606213 \n",
|
||
"8.316766 214.373394 22018.417441 8.059332 356.886136 363.266516 \n",
|
||
"9.092223 181.101509 17978.986339 6.546600 310.135738 398.410813 \n",
|
||
"5.584087 188.313324 28748.687739 7.544869 326.678363 280.467916 \n",
|
||
"10.223862 248.071735 28749.716544 7.513408 393.663396 283.651634 \n",
|
||
"8.635849 203.361523 13672.091764 4.563009 303.309771 474.607645 \n",
|
||
"NaN 118.988579 14285.583854 7.804174 268.646941 389.375566 \n",
|
||
"11.180284 227.231469 25484.508491 9.077200 404.041635 563.885481 \n",
|
||
"7.360640 165.520797 32452.614409 7.550701 326.624353 425.383419 \n",
|
||
"7.974522 218.693300 18767.656682 8.110385 NaN 364.098230 \n",
|
||
"7.119824 156.704993 18730.813653 3.606036 282.344050 347.715027 \n",
|
||
"NaN 150.174923 27331.361962 6.838223 299.415781 379.761835 \n",
|
||
"7.496232 205.344982 28388.004887 5.072558 NaN 444.645352 \n",
|
||
"6.347272 186.732881 41065.234765 9.629596 364.487687 516.743282 \n",
|
||
"7.051786 211.049406 30980.600787 10.094796 NaN 315.141267 \n",
|
||
"9.181560 273.813807 24041.326280 6.904990 398.350517 477.974642 \n",
|
||
"8.975464 279.357167 19460.398131 6.204321 NaN 431.443990 \n",
|
||
"7.371050 214.496610 25630.320037 4.432669 335.754439 469.914551 \n",
|
||
"\n",
|
||
" Organic carbon Trihalomethanes Turbidity Potability \n",
|
||
"ph \n",
|
||
"NaN 10.379783 86.990970 2.963135 0 \n",
|
||
"3.716080 15.180013 56.329076 4.500656 0 \n",
|
||
"8.099124 16.868637 66.420093 3.055934 0 \n",
|
||
"8.316766 18.436524 100.341674 4.628771 0 \n",
|
||
"9.092223 11.558279 31.997993 4.075075 0 \n",
|
||
"5.584087 8.399735 54.917862 2.559708 0 \n",
|
||
"10.223862 13.789695 84.603556 2.672989 0 \n",
|
||
"8.635849 12.363817 62.798309 4.401425 0 \n",
|
||
"NaN 12.706049 53.928846 3.595017 0 \n",
|
||
"11.180284 17.927806 71.976601 4.370562 0 \n",
|
||
"7.360640 15.586810 78.740016 3.662292 0 \n",
|
||
"7.974522 14.525746 76.485911 4.011718 0 \n",
|
||
"7.119824 15.929536 79.500778 3.445756 0 \n",
|
||
"NaN 19.370807 76.509996 4.413974 0 \n",
|
||
"7.496232 13.228311 70.300213 4.777382 0 \n",
|
||
"6.347272 11.539781 75.071617 4.376348 0 \n",
|
||
"7.051786 20.397022 56.651604 4.268429 0 \n",
|
||
"9.181560 13.387341 71.457362 4.503661 0 \n",
|
||
"8.975464 12.888759 63.821237 2.436086 0 \n",
|
||
"7.371050 12.509164 62.797277 2.560299 0 "
|
||
]
|
||
},
|
||
"execution_count": 10,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.head(20) # Use the head method to show the first 20 rows of the DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "UnU4xLLzhkxG"
|
||
},
|
||
"source": [
|
"**Information about the loaded data**:\n",
"\n",
"- Count how many records there are\n",
"- Look at the data types\n",
"- Check whether there are missing values"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 34
|
||
},
|
||
"id": "Z-MKWiELhkxP",
|
||
"outputId": "68ca424e-83eb-4d17-d779-1ea7f1a6b4ae"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"(3276, 9)"
|
||
]
|
||
},
|
||
"execution_count": 11,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.shape # The .shape attribute (just as for NumPy arrays) gives the dimensions of the DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 34
|
||
},
|
||
"id": "SEr52zb4hkxT",
|
||
"outputId": "d72fe356-c89d-4d61-c62b-a64359f748d2"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"29484"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.size # The .size attribute (just as for NumPy arrays) gives the number of elements in the DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "sw8ATDX1hkxJ",
|
||
"outputId": "2cfce98c-00bf-4093-8bf5-6573e0cd909f",
|
||
"scrolled": true
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Hardness 3276\n",
|
||
"Solids 3276\n",
|
||
"Chloramines 3276\n",
|
||
"Sulfate 2495\n",
|
||
"Conductivity 3276\n",
|
||
"Organic carbon 3276\n",
|
||
"Trihalomethanes 3114\n",
|
||
"Turbidity 3276\n",
|
||
"Potability 3276\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.count() # The count method counts the non-null entries in each column"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "OwVkE1MKX6mW",
|
||
"outputId": "9a9f1142-de9c-4ffa-e051-bb58af728151"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Hardness 100\n",
|
||
"Solids 100\n",
|
||
"Chloramines 100\n",
|
||
"Sulfate 74\n",
|
||
"Conductivity 100\n",
|
||
"Organic carbon 100\n",
|
||
"Trihalomethanes 98\n",
|
||
"Turbidity 100\n",
|
||
"Potability 100\n",
|
||
"dtype: int64"
|
||
]
|
||
},
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.head(100).count() # Apply .count() to the first hundred rows of the DataFrame"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "5AjpVmYFhkxX"
|
||
},
|
||
"source": [
|
"- The info() method also shows the data type of each column"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 306
|
||
},
|
||
"id": "G8RHx3kvhkxZ",
|
||
"outputId": "cf46dd23-3acf-4d2e-c8c4-046fa7b1f8d6"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"<class 'pandas.core.frame.DataFrame'>\n",
|
||
"Index: 3276 entries, nan to 7.87467135779128\n",
|
||
"Data columns (total 9 columns):\n",
|
||
" # Column Non-Null Count Dtype \n",
|
||
"--- ------ -------------- ----- \n",
|
||
" 0 Hardness 3276 non-null float64\n",
|
||
" 1 Solids 3276 non-null float64\n",
|
||
" 2 Chloramines 3276 non-null float64\n",
|
||
" 3 Sulfate 2495 non-null float64\n",
|
||
" 4 Conductivity 3276 non-null float64\n",
|
||
" 5 Organic carbon 3276 non-null float64\n",
|
||
" 6 Trihalomethanes 3114 non-null float64\n",
|
||
" 7 Turbidity 3276 non-null float64\n",
|
||
" 8 Potability 3276 non-null int64 \n",
|
||
"dtypes: float64(8), int64(1)\n",
|
||
"memory usage: 255.9 KB\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
"data.info() # The .info() method shows each column's type and the memory used"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "cMRzYwQdhkxd",
|
||
"outputId": "4afa01d4-ec65-452a-fc89-c503253c1efa"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"Hardness float64\n",
|
||
"Solids float64\n",
|
||
"Chloramines float64\n",
|
||
"Sulfate float64\n",
|
||
"Conductivity float64\n",
|
||
"Organic carbon float64\n",
|
||
"Trihalomethanes float64\n",
|
||
"Turbidity float64\n",
|
||
"Potability int64\n",
|
||
"dtype: object"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
"data.dtypes # The .dtypes attribute shows just the type of each column"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "S3TniwKUhkxh"
|
||
},
|
||
"source": [
|
"Let's start checking for missing values!\n",
"\n",
"- .isnull() returns a table where False means the cell is filled and True means the cell is empty :( Its closest relative is isna(), which is the same method under another name"
|
||
]
|
||
},
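{
"cell_type": "markdown",
"metadata": {},
"source": [
"A common follow-up, sketched here on a tiny invented frame rather than on the loaded dataset: summing the boolean table by column gives the number of missing values in each column. The name `toy` and its values are assumptions made for the example."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Invented values; None becomes a missing value in a numeric column\n",
"toy = pd.DataFrame({'Sulfate': [368.5, None, None], 'Turbidity': [2.96, 4.50, 3.06]})\n",
"\n",
"mask = toy.isnull()      # True where a cell is empty\n",
"print(mask.sum())        # number of missing values per column\n",
"print(mask.any(axis=1))  # which rows contain at least one gap"
]
},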
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "uq1iywLbYsxS",
|
||
"outputId": "fcc31e0d-6e49-4967-ac34-d03865227b1f"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Hardness</th>\n",
" <th>Solids</th>\n",
" <th>Chloramines</th>\n",
" <th>Sulfate</th>\n",
" <th>Conductivity</th>\n",
" <th>Organic carbon</th>\n",
" <th>Trihalomethanes</th>\n",
" <th>Turbidity</th>\n",
" <th>Potability</th>\n",
" </tr>\n",
" <tr>\n",
" <th>ph</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>NaN</th>\n",
" <td>204.890455</td>\n",
" <td>20791.318981</td>\n",
" <td>7.300212</td>\n",
" <td>368.516441</td>\n",
" <td>564.308654</td>\n",
" <td>10.379783</td>\n",
" <td>86.990970</td>\n",
" <td>2.963135</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3.716080</th>\n",
" <td>129.422921</td>\n",
" <td>18630.057858</td>\n",
" <td>6.635246</td>\n",
" <td>NaN</td>\n",
" <td>592.885359</td>\n",
" <td>15.180013</td>\n",
" <td>56.329076</td>\n",
" <td>4.500656</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8.099124</th>\n",
" <td>224.236259</td>\n",
" <td>19909.541732</td>\n",
" <td>9.275884</td>\n",
" <td>NaN</td>\n",
" <td>418.606213</td>\n",
" <td>16.868637</td>\n",
" <td>66.420093</td>\n",
" <td>3.055934</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>8.316766</th>\n",
" <td>214.373394</td>\n",
" <td>22018.417441</td>\n",
" <td>8.059332</td>\n",
" <td>356.886136</td>\n",
" <td>363.266516</td>\n",
" <td>18.436524</td>\n",
" <td>100.341674</td>\n",
" <td>4.628771</td>\n",
" <td>0</td>\n",
" </tr>\n",
" <tr>\n",
" <th>9.092223</th>\n",
" <td>181.101509</td>\n",
" <td>17978.986339</td>\n",
" <td>6.546600</td>\n",
" <td>310.135738</td>\n",
" <td>398.410813</td>\n",
" <td>11.558279</td>\n",
" <td>31.997993</td>\n",
" <td>4.075075</td>\n",
" <td>0</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Hardness Solids Chloramines Sulfate Conductivity \\\n",
"ph \n",
"NaN 204.890455 20791.318981 7.300212 368.516441 564.308654 \n",
"3.716080 129.422921 18630.057858 6.635246 NaN 592.885359 \n",
"8.099124 224.236259 19909.541732 9.275884 NaN 418.606213 \n",
"8.316766 214.373394 22018.417441 8.059332 356.886136 363.266516 \n",
"9.092223 181.101509 17978.986339 6.546600 310.135738 398.410813 \n",
"\n",
" Organic carbon Trihalomethanes Turbidity Potability \n",
"ph \n",
"NaN 10.379783 86.990970 2.963135 0 \n",
"3.716080 15.180013 56.329076 4.500656 0 \n",
"8.099124 16.868637 66.420093 3.055934 0 \n",
"8.316766 18.436524 100.341674 4.628771 0 \n",
"9.092223 11.558279 31.997993 4.075075 0 "
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.head()  # Show the first 5 rows of our DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "ZjTn7cM5zyta",
"outputId": "b3889e78-08c3-4bdf-c4ec-260d35eed9ea"
},
"outputs": [
{
"data": {
"text/plain": [
"Hardness 0\n",
"Solids 0\n",
"Chloramines 0\n",
"Sulfate 781\n",
"Conductivity 0\n",
"Organic carbon 0\n",
"Trihalomethanes 162\n",
"Turbidity 0\n",
"Potability 0\n",
"dtype: int64"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.isna().sum()  # Count the missing values in each column: isna() flags them and .sum() adds up the flags"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "D7aHOhOGY5pe",
"outputId": "ce904bd7-3087-40a2-824e-7f7047f358db"
},
"outputs": [
{
"data": {
"text/plain": [
"Hardness 0\n",
"Solids 0\n",
"Chloramines 0\n",
"Sulfate 26\n",
"Conductivity 0\n",
"Organic carbon 0\n",
"Trihalomethanes 2\n",
"Turbidity 0\n",
"Potability 0\n",
"dtype: int64"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.head(100).isna().sum()  # Count the missing values in each column for the first hundred records"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "ZaiGPw-KY9eQ",
"outputId": "3ffbc798-c5bc-4970-c6c3-1608da30afe4"
},
"outputs": [
{
"data": {
"text/plain": [
"Hardness 0\n",
"Solids 0\n",
"Chloramines 0\n",
"Sulfate 26\n",
"Conductivity 0\n",
"Organic carbon 0\n",
"Trihalomethanes 2\n",
"Turbidity 0\n",
"Potability 0\n",
"dtype: int64"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data.isna().head(100).sum()  # Count the missing values in each column for the first hundred records (equivalent to the previous expression)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 204
},
"id": "NHeX2czDhkxi",
"outputId": "9995758b-2f88-47ca-ab63-37cb364875dd"
},
"outputs": [
{
"data": {
"text/plain": [
"Hardness 0.000000\n",
"Solids 0.000000\n",
"Chloramines 0.000000\n",
"Sulfate 0.238400\n",
"Conductivity 0.000000\n",
"Organic carbon 0.000000\n",
"Trihalomethanes 0.049451\n",
"Turbidity 0.000000\n",
"Potability 0.000000\n",
"dtype: float64"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"round(data.isna().sum() / data.shape[0], 6)  # Compute the fraction of missing values in each column relative to the total number of rows"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"id": "tvAQTignhkxo",
"outputId": "7d223855-7d97-4529-bb74-572e21ed89a2"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"943\n"
]
}
],
"source": [
"proc = data.isna().sum().sum()  # Count the total number of missing values (across all columns) in our DataFrame\n",
"print(proc)  # Print the total number of missing values"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 34
},
"id": "EOZz-GAPhkxr",
"outputId": "b7997cfd-a292-48ff-d431-ff425d210a7c"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3.2%\n"
]
}
],
"source": [
"# Express the share of missing values as a percentage of all cells in the DataFrame\n",
"proc = data.isna().sum().sum() / data.size\n",
"print(round(100 * proc, 1), '%', sep='')"
]
},
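{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch, not part of the original notebook: the per-column share of missing values can also be computed with `isna().mean()`, since the mean of a boolean column equals the fraction of `True` values. The toy frame `demo` and its numbers are hypothetical and serve only as an illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data, used only to illustrate the idea\n",
"demo = pd.DataFrame({\n",
"    'Sulfate': [368.5, np.nan, np.nan, 356.9],\n",
"    'Turbidity': [2.96, 4.50, 3.06, 4.63],\n",
"})\n",
"\n",
"# Share of missing values in each column; same result as demo.isna().sum() / demo.shape[0]\n",
"per_column = demo.isna().mean()\n",
"\n",
"# Share of missing cells in the whole frame, formatted as a percentage\n",
"overall = demo.isna().sum().sum() / demo.size\n",
"print(per_column)\n",
"print(f'{100 * overall:.1f}%')"
]
},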
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"id": "OuW1gRtlhkxz"
},
"outputs": [],
"source": [
"### How to assess missing values visually"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "E8w2yGJ1hkx6"
},
"source": [
"What should we do with them?\n",
"\n",
"There are not many options: <br>\n",
"\n",
"1) Drop them: \n",
"- dropna(axis=0, how='any'): axis=0 drops rows, axis=1 drops columns; how='any' drops a row or column if at least one of its cells is empty; how='all' drops it only if the entire row or column is empty\n",
"\n",
"2) Fill in the values ourselves:\n",
"- fillna() - choosing how to fill the gaps is an art of its own. "
]
},
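{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two options above can be sketched on a small example. This is an illustrative sketch rather than the notebook's own workflow: the frame `demo` and its values are hypothetical, and filling with the column median is just one possible choice for `fillna()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical toy data, used only to illustrate the idea\n",
"demo = pd.DataFrame({\n",
"    'Sulfate': [368.5, np.nan, 310.1, np.nan],\n",
"    'Trihalomethanes': [87.0, 56.3, np.nan, np.nan],\n",
"})\n",
"\n",
"rows_any = demo.dropna(axis=0, how='any')  # keeps only rows without a single NaN\n",
"rows_all = demo.dropna(axis=0, how='all')  # drops only rows in which every value is NaN\n",
"filled = demo.fillna(demo.median())        # replaces each NaN with the median of its column\n",
"\n",
"# Number of rows left after each kind of dropna, and NaN count after filling\n",
"print(len(rows_any), len(rows_all), int(filled.isna().sum().sum()))"
]
},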
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: matplotlib in ./venv/lib/python3.13/site-packages (3.10.0)\n",
"Requirement already satisfied: contourpy>=1.0.1 in ./venv/lib/python3.13/site-packages (from matplotlib) (1.3.1)\n",
"Requirement already satisfied: cycler>=0.10 in ./venv/lib/python3.13/site-packages (from matplotlib) (0.12.1)\n",
"Requirement already satisfied: fonttools>=4.22.0 in ./venv/lib/python3.13/site-packages (from matplotlib) (4.56.0)\n",
"Requirement already satisfied: kiwisolver>=1.3.1 in ./venv/lib/python3.13/site-packages (from matplotlib) (1.4.8)\n",
"Requirement already satisfied: numpy>=1.23 in ./venv/lib/python3.13/site-packages (from matplotlib) (2.2.3)\n",
"Requirement already satisfied: packaging>=20.0 in ./venv/lib/python3.13/site-packages (from matplotlib) (24.2)\n",
"Requirement already satisfied: pillow>=8 in ./venv/lib/python3.13/site-packages (from matplotlib) (11.1.0)\n",
"Requirement already satisfied: pyparsing>=2.3.1 in ./venv/lib/python3.13/site-packages (from matplotlib) (3.2.1)\n",
"Requirement already satisfied: python-dateutil>=2.7 in ./venv/lib/python3.13/site-packages (from matplotlib) (2.9.0.post0)\n",
"Requirement already satisfied: six>=1.5 in ./venv/lib/python3.13/site-packages (from python-dateutil>=2.7->matplotlib) (1.17.0)\n",
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.0.1\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
"Requirement already satisfied: seaborn in ./venv/lib/python3.13/site-packages (0.13.2)\n",
"Requirement already satisfied: numpy!=1.24.0,>=1.20 in ./venv/lib/python3.13/site-packages (from seaborn) (2.2.3)\n",
"Requirement already satisfied: pandas>=1.2 in ./venv/lib/python3.13/site-packages (from seaborn) (2.2.3)\n",
"Requirement already satisfied: matplotlib!=3.6.1,>=3.4 in ./venv/lib/python3.13/site-packages (from seaborn) (3.10.0)\n",
"Requirement already satisfied: contourpy>=1.0.1 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.3.1)\n",
"Requirement already satisfied: cycler>=0.10 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (0.12.1)\n",
"Requirement already satisfied: fonttools>=4.22.0 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (4.56.0)\n",
"Requirement already satisfied: kiwisolver>=1.3.1 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (1.4.8)\n",
"Requirement already satisfied: packaging>=20.0 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (24.2)\n",
"Requirement already satisfied: pillow>=8 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (11.1.0)\n",
"Requirement already satisfied: pyparsing>=2.3.1 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (3.2.1)\n",
"Requirement already satisfied: python-dateutil>=2.7 in ./venv/lib/python3.13/site-packages (from matplotlib!=3.6.1,>=3.4->seaborn) (2.9.0.post0)\n",
"Requirement already satisfied: pytz>=2020.1 in ./venv/lib/python3.13/site-packages (from pandas>=1.2->seaborn) (2025.1)\n",
"Requirement already satisfied: tzdata>=2022.7 in ./venv/lib/python3.13/site-packages (from pandas>=1.2->seaborn) (2025.1)\n",
"Requirement already satisfied: six>=1.5 in ./venv/lib/python3.13/site-packages (from python-dateutil>=2.7->matplotlib!=3.6.1,>=3.4->seaborn) (1.17.0)\n",
"\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m24.3.1\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m25.0.1\u001b[0m\n",
"\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n"
]
}
],
"source": [
"!pip install matplotlib\n",
"!pip install seaborn"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/",
"height": 735
},
"id": "ToPE3VkWhkx1",
"outputId": "6a5c7213-1a94-4823-e5cf-3d19468a890d"
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAABjUAAAPHCAYAAABg4cu8AAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjAsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvlHJYcgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAUtNJREFUeJzt3XuU3uO5P/5rkogcJkGIxNjJpCEIjSHstqjjlg5ahx5UUaQlbR2rxbJ869DSog5RKmip6NYKVXTTRcehlMYpSoZoTiIR7Thmsyuq4nD//ujK88vkNM8Tidudeb3WspaZeeaZeyaf+Tyfud6f67rrUkopAAAAAAAAPuK65F4AAAAAAABANYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEbrl+sKjuuyf60sDAAAAAAAfMXe9f2OHj9GpAQAAAAAAFEGoAQAAAAAAFCHb+CkAgJa21txLgKo1NzTlXgIAAECnp1MDAAAAAAAoglADAAAAAAAogvFTAEA2xvkAAAAAtdCpAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFEGoAQAAAAAAFKFb7gUAAJ1XS1tr7iVA1ZobmnIvAQAAoNMTagAA2SgSAwAAALUwfgoAAAAAACiCTg0AIBvjpyiJziIAAID8hBoAQDaKxAAAAEAtjJ8CAAAAAACKoFMDAMjG+ClKorMIAAAgP6EGAJCNIjEAAABQC6EGAJCNTg1KIoQDAADIT6gBAGSjSAwAAADUwkbhAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEeypAQAAAGTV0taaewlQNfvCAeQl1AAAslHAoCQKGACrjnMsAFAtoQYAkI0CBgAAAFALoQYAkI1ODUoihAMAAMhPqAEAZKNIDAAAANRCqAEAZKNTg5II4QAAAPITagAA2SgSAwAAALXoknsBAAAAAAAA1RBqAAAAAAAARRBqAAAAAAAARbCnBgCQjY3CKYk9YAAAAPLTqQEAAAAAABRBpwYAkI073wEAAIBa6NQAAAAAAACKoFMDAMjGnhqURGcRAABAfkINACAbRWIAAACgFsZPAQAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAAReiWewEAAABA59bS1pp7CVC15oam3EsA6NSEGgAAAEBWisQAQLWMnwIAAAAAAIqgUwMAyMaoCUriLmIAAID8hBoAQDaKxAAAAEAtjJ8CAAAAAACKINQAAAAAAACKINQAAAAAAACKYE8NACAbG4VTEnvAAAAA5KdTAwAAAAAAKIJODQAgG3e+AwAAALXQqQEAAAAAABRBqAEAAAAAABRBqAEAAAAAABRBqAEAAAAAABRBqAEAAAAAABRBqAEAAAAAABShW+4FAACdV0tba+4lQNWaG5pyLwEAAKDTE2oAANkoEgMAAAC1EGoAANno1KAkQjgAAID8hBoAQDaKxAAAAEAtbBQOAAAAAAAUQacGAAAAkJWRlJREtzFAXkINAAAAICtFYgCgWsZPAQAAAAAARRBqAAAAAAAARTB+CgDIxvxsSmI0CgAAQH5CDQAgG0ViAAAAoBbGTwEAAAAAAEXQqQEAZGP8FCXRWQQAAJCfUAMAyEaRGAAAAKiFUAMAyEanBiURwgEAAORnTw0AAAAAAKAIOjUAgGzc+Q4AAADUQqgBAGRj/BQlEcIBAADkJ9QAALJRJAYAAABqYU8NAAAAAACgCEINAAAAAACgCEINAAAAAACgCEINAAAAAACgCDYKBwCyaWlrzb0EqJqN7QEAAPLTqQEAAAAAABRBpwYAkI073wEAAIBaCDUAgGyMn6IkQjgAAID8hBoAQDaKxAAAAEAthB
oAQDY6NSiJEA4AACA/oQYAkI0iMQAAAFCLLrkXAAAAAAAAUA2hBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUIRuuRcAAHReLW2tuZcAVWtuaMq9BAAAgE5PpwYAAAAAAFAEnRoAQDbufAcAAABqoVMDAAAAAAAogk4NAAAAICv7bFES3cYAeQk1AIBsFDAoiQIGAABAfkINACAbRWIAIMI1AQBQPXtqAAAAAAAARRBqAAAAAAAARTB+CgAAAMjKPluUxLg0gLyEGgAAAEBWisQAQLWMnwIAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIrQLfcCAIDOq6WtNfcSoGrNDU25lwAAANDpCTUAgGwUiQEAAIBaGD8FAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUwZ4aAEA2NgqnJPaAAQAAyE+nBgAAAAAAUASdGgBANu58BwAAAGqhUwMAAAAAACiCTg0AIBt7alASnUUAAAD5CTUAgGwUiQEAAIBaGD8FAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUwZ4aAEA2NgqnJPaAAQAAyE+oAQBko0gMAAAA1ML4KQAAAAAAoAg6NQCAbIyfoiQ6iwAAAPITagAA2SgSAwAAALUQagAAAABZ6d6kJG7MAchLqAEAAABkpUgMAFTLRuEAAAAAAEARdGoAANkYNUFJ3EUMAACQn1ADAMhGkRgAAACohVADAMhGpwYlEcIBAADkJ9QAALJRJAYAAABqIdQAALLRqUFJhHAAAAD5dcm9AAAAAAAAgGro1AAAsnHnOwAAAFALnRoAAAAAAEARhBoAAAAAAEARhBoAAAAAAEARhBoAAAAAAEARbBQOAGTT0taaewlQNRvbAwAA5KdTAwAAAAAAKIJODQAgG3e+AwAAALXQqQEAAAAAABRBqAEAAAAAABTB+CkAIBsbhVMS49IAAADy06kBAAAAAAAUQacGAJCNO98BgAjdm5TFNSxAXkINAAAAICtFYgCgWkINAAAAICudGpRECAeQl1ADAAAAyEqRGACollADAMjGXZmURMENAAAgP6EGAJCNIjEAAABQC6EGAAAAkJXuTUrixhyAvIQaAEA2ChiURAEDAAAgvy65FwAAAAAAAFANnRoAQDbufAcAIlwTAADV06kBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUoVvuBQAAAACdW0tba+4lQNWaG5pyLwGgUxNqAAAAAFkpEgMA1TJ+CgAAAAAAKIJQAwAAAAAAKILxUwBANuZnUxKjUQAAAPITagAA2SgSAwAAALUwfgoAAAAAACiCTg0AIBvjpyiJziIAAID8dGoAAAAAAABFEGoAAAAAAABFEGoAAAAAAABFsKcGAJCNPQoAAACAWujUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAimCjcAAAACCrlrbW3EuAqjU3NOVeAkCnJtQAAAAAslIkBgCqZfwUAAAAAABQBJ0aAEA2Rk1QEncRAwAA5CfUAACyUSQGAAAAamH8FAAAAAAAUAShBgAAAAAAUATjpwCAbOypQUmMSwMAAMhPqAEAZKNIDAAAANTC+CkAAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAINgoHALJpaWvNvQSomo3tAQAA8hNqAADZKBIDAAAAtTB+CgAAAAAAKIJODQAAACArIykpiW5jgLyEGgAAAEBWisQAQL
WMnwIAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIpgo3AAIJuWttbcS4Cq2cQWAAAgP50aAAAAAABAEXRqAADZuPMdAAAAqIVODQAAAAAAoAg6NQCAbOypQUl0FgEAAOQn1AAAslEkBgAAAGph/BQAAAAAAFAEoQYAAAAAAFAE46cAgGzsqUFJjEsDAADIT6gBAGSjSAwAAADUQqgBAGSjU4OSCOEAAADys6cGAAAAAABQBJ0aAEA27nwHAAAAaqFTAwAAAAAAKIJODQAgG3tqUBKdRQCrjmsCSuKaACAvoQYAkI0/CAGACNcEAED1jJ8CAAAAAACKoFMDAMjGqAlK4i5iAACA/IQaAEA2isQAAABALYQaAAAAQFa6NymJG3MA8hJqAAAAAFkpEgMA1bJROAAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUIRuuRcAAHReLW2tuZcAVWtuaMq9BAAAgE5PqAEAZKNIDAAAANRCqAEAZKNTg5II4QAAAPKzpwYAAAAAAFAEnRoAQDbufAcAAABqIdQAALIxfoqSCOEAVh3XBJTENQFAXkINACAbfxACABGuCQCA6tlTAwAAAAAAKIJQAwAAAAAAKIJQAwAAAAAAKII9NQAAAICsbBROSewBA5CXUAMAAADISpEYAKiW8VMAAAAAAEARdGoAANkYNUFJ3EUMAACQn1ADAMhGkRgAAACohVADAMhGpwYlEcIBAADkJ9QAALJRJAYAAABqIdQAALLRqUFJhHAAAAD5CTUAgGwUiQEAAIBadMm9AAAAAAAAgGro1AAAsjF+ipLoLAIAAMhPqAEAZKNIDAAAANTC+CkAAAAAAKAIQg0AAAAAAKAIxk8BANnYU4OSGJcGAACQn1ADAMhGkRgAAACohfFTAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEbrlXgAA0Hm1tLXmXgJUrbmhKfcSAAAAOj2dGgAAAAAAQBF0agAA2bjzHQAAAKiFTg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAI3XIvAADovFraWnMvAarW3NCUewkAAACdnlADAMhGkRgAAACohfFTAAAAAABAEYQaAAAAAABAEYyfAgCysacGJTEuDQAAID+hBgCQjSIxAAAAUAvjpwAAAAAAgCLo1AAAsjF+ipLoLAIAAMhPqAEAZKNIDAAAANTC+CkAAAAAAKAIQg0AAAAAAKAIxk8BANnYU4OSGJcGAACQn1ADAMhGkRgAAACohfFTAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEYQaAAAAAABAEbrlXgAA0Hm1tLXmXgJUrbmhKfcSAAAAOj2hBgCQjSIxAAAAUAuhBgCQjU4NSiKEAwAAyM+eGgAAAAAAQBF0agAA2bjzHQAAAKiFTg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAINgoHALJpaWvNvQSomo3tAQAA8hNqAADZKBIDAAAAtTB+CgAAAAAAKIJQAwAAAAAAKILxUwBANvbUoCTGpQEAAOQn1AAAslEkBgAAAGph/BQAAAAAAFAEoQYAAAAAAFAE46cAgGzsqUFJjEsDAADIT6cGAAAAAABQBJ0aAEA27nwHAAAAaiHUAACyMX6KkgjhAAAA8hNqAADZKBIDAAAAtRBqAAAAAFnp3qQkbswByEuoAQAAAGSlSAwAVKtL7gUAAAAAAABUQ6gBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUwUbhAEA2LW2tuZcAVbOJLQAAQH5CDQAgG0ViAAAAoBZCDQAAACAr3ZuUxI05AHkJNQAAAICsFIkBgGrZKBwAAAAAACiCTg0AIBujJiiJu4gBAADyE2
oAANkoEgMAAAC1MH4KAAAAAAAoglADAAAAAAAoglADAAAAAAAoglADAAAAAAAoglADAAAAAAAoQrfcCwAAOq+WttbcS4CqNTc05V4CAABAp6dTAwAAAAAAKIJODQAgG3e+AwAAALXQqQEAAAAAABRBpwYAkI09NSiJziIAAID8hBoAQDaKxAAAAEAthBoAQDY6NSiJEA4AACA/e2oAAAAAAABF0KkBAGTjzncAIEL3JmVxDQuQl1ADAAAAyEqRGACollADAMjGXZmURMENAAAgP6EGAJCNIjEAAABQCxuFAwAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAAReiWewEAAABA59bS1pp7CVC15oam3EsA6NSEGgAAAEBWisQAQLWMnwIAAAAAAIqgUwMAyMaoCUriLmIAAID8hBoAQDaKxAAAAEAtjJ8CAAAAAACKINQAAAAAAACKYPwUAAAAkJV9tiiJEaoAeQk1AIBsFDAoiQIGwKrjHAsAVMv4KQAAAAAAoAg6NQCAbNyVCQAAANRCpwYAAAAAAFAEnRoAQDb21KAkOosAAADy06kBAAAAAAAUQacGAJCNO98BAACAWgg1AIBsjJ+iJEI4AACA/IQaAEA2isQAAABALeypAQAAAAAAFEGoAQAAAAAAFMH4KQAAACAr+2xREiNUAfISagAAAABZKRIDANUyfgoAAAAAACiCTg0AIBujJiiJu4gBAADyE2oAANkoEgMAAAC1EGoAANno1KAkQjgAAID87KkBAAAAAAAUQacGAJCNO98BAACAWujUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAiiDUAAAAAAAAitAt9wIAgM6rpa019xKgas0NTbmXAAAA0Onp1AAAAAAAAIog1AAAAAAAAIpg/BQAkI1xPgAAAEAthBoAQDb21KAkQjgAAID8jJ8CAAAAAACKoFMDAMjGne8AAABALYQaAEA2xk9REiEcAABAfkINACAbRWIAAACgFkINACAbnRqURAgHAACQn1ADAMhGkRgAAACoRZfcCwAAAAAAAKiGTg0AIBvjpyiJziIAAID8hBoAQDaKxAAAAEAthBoAQDY6NSiJEA4AACA/oQYAkI0iMQAAAFALG4UDAAAAAABF0KkBAGRj/BQl0VkEAACQn1ADAMhGkRgAAACohVADAMhGpwYlEcIBAADkJ9QAALJRJAYAItzoQFlcwwLkJdQAAAAAslIkBgCq1SX3AgAAAAAAAKqhUwMAyMaoCUriLmIAAID8hBoAQDaKxAAAAEAtjJ8CAAAAAACKINQAAAAAAACKINQAAAAAAACKYE8NACAbG4VTEnvAAAAA5KdTAwAAAAAAKIJODQAgG3e+AwAAALXQqQEAAAAAABRBpwYAkI09NSiJziIAAID8hBoAQDaKxAAAAEAtjJ8CAAAAAACKoFMDAMjG+ClKorMIAAAgP6EGAJCNIjEAAABQC6EGAJCNTg1KIoQDWHVcE1AS1wQAeQk1AIBs/EEIAES4JgAAqmejcAAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAjdci8AAAAA6Nxa2lpzLwGq1tzQlHsJAJ2aUAMAAADISpEYAKiW8VMAAAAAAEARVrhTY+bMmXHvvffGyy+/HO+//367j51++ukfeGEAAAAAAACLWqFQ48orr4wjjzwy1ltvvRg4cGDU1dVVPlZXVyfUAAAAAAAAVrq6lFKq9ZMaGxvjqKOOipNPPnmFv/CoLvuv8OcCAAAAAACrl7vev7HDx6zQnhqvvfZa7L+/UAIAAAAAAPjwrFCosf/++8edd965stcCAAAAAACwTFXvqXHJJZdU/n/jjTeO0047LR5++OEYMWJErLHGGu0ee9xxx628FQIAAAAAAEQNe2p87GMfq+
4J6+ri2Wef7fBx9tQAAAAAAAAWqmZPjao7NWbPnr3U9y/MROrq6qp9KgAAAAAAgJpVHWos7he/+EVcdNFFMXPmzIiIGDZsWBx//PFxxBFHrLTFAQCrt5a21txLgKo1NzTlXgIAAECnt0Khxumnnx5jx46NY489NrbbbruIiHjooYfiO9/5TsydOzfOPPPMlbpIAGD1pEgMAAAA1KLqPTUW1b9//7jkkkviwAMPbPf+CRMmxLHHHhuvvvpqh89hTw0AAAAAAGChavbU6LIiT/zOO+/Etttuu8T7t9lmm3j33XdX5CkBAAAAAACWa4VCjUMOOSQuv/zyJd7/85//PA4++OAPvCgAAAAAAIDFfaCNwu+888741Kc+FRERjzzySMydOzcOPfTQ+O53v1t53NixYz/4KgEAAAAAgE5vhUKNKVOmxMiRIyMiYtasWRERsd5668V6660XU6ZMqTyurq5uJSwRAFhdtbS15l4CVM3G9gAAAPmt0EbhK4ONwgEAAAAAgIWq2Sh8hcdPAQB8UDo1KIlODQAAgPyEGgBANorEAAAAQC265F4AAAAAAABANYQaAAAAAABAEYQaAAAAAABAEeypAQBkY6NwSmIPGAAAgPyEGgBANorEAAAAQC2MnwIAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIrQLfcCAIDOq6WtNfcSoGrNDU25lwAAANDpCTUAgGwUiQEAAIBaGD8FAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUQagBAAAAAAAUoVvuBQAAnVdLW2vuJUDVmhuaci8BAACg0xNqAADZKBIDAAAAtTB+CgAAAAAAKIJODQAgG+OnKInOIgAAgPyEGgBANorEAAAAQC2EGgAAAEBWujcpiRtzAPISagAAAABZKRIDANUSagAA2bgrk5IouAEAAOQn1AAAslEkBgAAAGrRJfcCAAAAAAAAqqFTAwDIxvgpSqKzCGDVcU1ASVwTAOQl1AAAsvEHIQAQ4ZoAAKie8VMAAAAAAEARhBoAAAAAAEARjJ8CAAAAsrKnBiUxLg0gL6EGAAAAkJUiMQBQLeOnAAAAAACAIgg1AAAAAACAIgg1AAAAAACAIgg1AAAAAACAIgg1AAAAAACAInTLvQAAoPNqaWvNvQSoWnNDU+4lAAAAdHpCDQAgG0ViAAAAoBbGTwEAAAAAAEXQqQEAZGP8FCXRWQQAAJCfUAMAyEaRGAAAAKiF8VMAAAAAAEARdGoAANkYP0VJdBYBAADkJ9QAALJRJAYAAABqYfwUAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBHtqAADZ2CicktgDBgAAID+hBgCQjSIxAAAAUAvjpwAAAAAAgCLo1AAAsjF+ipLoLAIAAMhPqAEAZKNIDAAAANTC+CkAAAAAAKAIOjUAgGyMn6IkOosAAADy06kBAAAAAAAUQagBAAAAAAAUwfgpACAb43wAAACAWujUAAAAAAAAiqBTAwDIxkbhlERnEQAAQH46NQAAAAAAgCLo1AAAsnHnOwAAAFALnRoAAAAAAEARdGoAANnYU4OS6CwCAADIT6gBAGSjSAwAAADUwvgpAAAAAACgCEINAAAAAACgCEINAAAAAACgCEINAAAAAACgCDYKBwCyaWlrzb0EqJqN7QFWHdcElMQ1AUBeQg0AIBt/EAIAEa4JAIDqGT8FAAAAAAAUQacGAAAAkJXxU5REZxFAXkINAAAAICtFYgCgWsZPAQAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAAReiWewEAQOfV0taaewlQteaGptxLAAAA6PSEGgBANorEAAAAQC2MnwIAAAAAAIqgUwMAAADIykhKSqLbGCAvoQ
YAAACQlSIxAFAtoQYAAACQlU4NSiKEA8hLqAEAAABkpUgMAFRLqAEAZOOuTEqi4AYAAJCfUAMAyEaRGAAAAKhFl9wLAAAAAAAAqIZODQAgG+OnKInOIgAAgPx0agAAAAAAAEXQqQEAZOPOdwAAAKAWOjUAAAAAAIAiCDUAAAAAAIAiCDUAAAAAAIAiCDUAAAAAAIAiCDUAAAAAAIAidMu9AACg82ppa829BKhac0NT7iUAAAB0ejo1AAAAAACAIgg1AAAAAACAIgg1AAAAAACAIthTAwDIxh4FAAAAQC10agAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEWwUTgAkE1LW2vuJUDVbGwPAACQn1ADAMhGkRgAAACohVADAMhGpwYlEcIBAADkJ9QAALJRJAYAAABqIdQAALLRqUFJhHAAAAD5CTUAgGwUiQEAAIBadMm9AAAAAAAAgGoINQAAAAAAgCIINQAAAAAAgCIINQAAAAAAgCLYKBwAyKalrTX3EqBqNrYHAADIT6gBAGSjSAwAAADUwvgpAAAAAACgCDo1AIBsjJ+iJDqLAAAA8hNqAADZKBIDAAAAtTB+CgAAAAAAKIJQAwAAAAAAKILxUwBANvbUoCTGpQEAAOQn1AAAslEkBgAAAGoh1AAAstGpQUmEcAAAAPkJNQCAbBSJAQAAgFrYKBwAAAAAACiCUAMAAAAAACiC8VMAQDb21KAkxqUBAADkJ9QAALJRJAYAAABqYfwUAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQhG65FwAAdF4tba25lwBVa25oyr0EAACATk+oAQBko0gMAAAA1ML4KQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAjdci8AAOi8Wtpacy8Bqtbc0JR7CQAAAJ2eUAMAyEaRGAAAAKiFUAMAyEanBiURwgEAAOQn1AAAslEkBgAAAGpho3AAAAAAAKAIOjUAAACArIykpCS6jQHyEmoAAAAAWSkSAwDVMn4KAAAAAAAoglADAAAAAAAogvFTAEA25mdTEqNRAAAA8tOpAQAAAAAAFEGoAQAAAAAAFMH4KQAgG+N8AAAAgFro1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIrQLfcCAIDOq6WtNfcSoGrNDU25lwAAANDp6dQAAAAAAACKoFMDAMjGne8AAABALXRqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARbCnBgCQTUtba+4lQNXsAQMAAJCfUAMAyEaRGAAAAKiFUAMAyEanBiURwgEAAOQn1AAAslEkBgAAAGpho3AAAAAAAKAIOjUAAACArIykpCS6jQHyEmoAAAAAWSkSAwDVMn4KAAAAAAAoglADAAAAAAAoglADAAAAAAAogj01AIBsbApKScx7BwAAyE+oAQBko0gMAES40YGyuIYFyEuoAQAAAGSlSAwAVMueGgAAAAAAQBGEGgAAAAAAQBGEGgAAAAAAQBGEGgAAAAAAQBGEGgAAAAAAQBGEGgAAAAAAQBG65V4AANB5tbS15l4CVK25oSn3EgAAADo9oQYAkI0iMQAAAFAL46cAAAAAAIAiCDUAAAAAAIAiCDUAAAAAAIAiCDUAAAAAAIAi2CgcAAAAyKqlrTX3EqBqzQ1NuZcA0KkJNQAAAICsFIkBgGoJNQCAbNyVSUkU3AAAAPITagAA2SgSAwAAALUQagAA2ejUoCRCOAAAgPyEGgBANorEAAAAQC265F4AAAAAAABANYQaAAAAAABAEYQaAAAAAABAEeypAQBkY6NwSmIPGAAAgPyEGgBANorEAAAAQC2MnwIAAAAAAIqgUwMAyMb4KUqiswgAACA/oQYAkI0iMQAAAFAL46cAAA
AAAIAi6NQAALIxfoqS6CwCAADIT6gBAGSjSAwAAADUwvgpAAAAAACgCEINAAAAAACgCEINAAAAAACgCPbUAACysVE4JbEHDAAAQH46NQAAAAAAgCIINQAAAAAAgCIYPwUAZGOcDwAAAFALnRoAAAAAAEARhBoAAAAAAEARhBoAAAAAAEARhBoAAAAAAEARbBQOAGTT0taaewlQNRvbAwAA5CfUAACyUSQGAAAAaiHUAACy0alBSYRwAAAA+Qk1AIBsFIkBgAg3OlAW17AAedkoHAAAAAAAKIJODQAAACArd74DANXSqQEAAAAAABRBqAEAAAAAABRBqAEAAAAAABRBqAEAAAAAABRBqAEAAAAAABShW+4FAACdV0tba+4lQNWaG5pyLwEAAKDT06kBAAAAAAAUQacGAJCNO98BAACAWujUAAAAAAAAiiDUAAAAAAAAimD8FACQjY3CKYlxaQAAAPnp1AAAAAAAAIqgUwMAyMad7wBAhO5NyuIaFiAvoQYAAACQlSIxAFAt46cAAAAAAIAi6NQAALIxaoKSuIsYAAAgP6EGAJCNIjEAAABQC6EGAJCNTg1KIoQDAADIz54aAAAAAABAEYQaAAAAAABAEYyfAgCyMc4HAAAAqIVODQAAAAAAoAg6NQCAbGwUTkl0FgEAAOSnUwMAAAAAACiCTg0AIBt3vgMAAAC10KkBAAAAAAAUQagBAAAAAAAUwfgpACAbG4VTEuPSAAAA8hNqAADZKBIDAAAAtTB+CgAAAAAAKIJODQAgG+OnKInOIgAAgPyEGgBANorEAAAAQC2EGgBANjo1KIkQDgAAID97agAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEXolnsBAEDn1dzQlHsJAAAAQEGEGgAAAEBWLW2tuZcAVXNjDkBeQg0AAAAgK0ViAKBa9tQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACK0C33AgCAzqulrTX3EqBqzQ1NuZcAAADQ6Qk1AIBsFIkBAACAWhg/BQAAAAAAFEGnBgCQjfFTlERnEQAAQH5CDQAgG0ViAAAAoBbGTwEAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEXolnsBAEDn1dLWmnsJULXmhqbcSwAAAOj0hBoAQDaKxAAAAEAtjJ8CAAAAAACKoFMDAMjG+ClKorMIAAAgP6EGAJCNIjEAAABQC+OnAAAAAACAIgg1AAAAAACAIgg1AAAAAACAIgg1AAAAAACAItgoHADIpqWtNfcSoGo2tgcAAMhPqAEAZKNIDAAAANTC+CkAAAAAAKAIOjUAAACArIykpCS6jQHyEmoAAAAAWSkSAwDVEmoAANm4K5OSKLgBAADkJ9QAALJRJAYAAABqYaNwAAAAAACgCDo1AIBsjJ+iJDqLAAAA8tOpAQAAAAAAFEGoAQAAAAAAFMH4KQAgG+N8AAAAgFro1AAAAAAAAIqgUwMAyMZG4ZREZxEAAEB+Qg0AIBtFYgAAAKAWQg0AIBudGpRECAcAAJCfUAMAyEaRGAAAAKiFjcIBAAAAAIAi6NQAALIxfoqS6CwCAADIT6gBAGSjSAwAAADUwvgpAAAAAACgCEINAAAAAACgCMZPAQDZ2FODkhiXBgAAkJ9QAwDIRpEYAAAAqIXxUwAAAAAAQBGEGgAAAAAAQBGEGgAAAAAAQBHsqQEAZGOjcEpiDxgAAID8dGoAAAAAAABF0KkBAGTjzncAAACgFjo1AAAAAACAIgg1AAAAAACAIhg/BQBkY6NwSmJcGgAAQH5CDQAgG0ViAAAAoBbGTwEAAAAAAEUQagAAAAAAAEUwfgoAyMaeGpTEuDQAAID8hBoAQDaKxAAAAEAtjJ8CAAAAAACKINQAAAAAAACKIN
QAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACK0C33AgCAzqulrTX3EqBqzQ1NuZcAAADQ6Qk1AIBsFIkBAACAWhg/BQAAAAAAFEGnBgCQjfFTlERnEQAAQH5CDQAgG0ViAAAAoBbGTwEAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEUQagAAAAAAAEXolnsBAEDn1dLWmnsJULXmhqbcSwAAAOj0hBoAQDaKxAAAAEAtjJ8CAAAAAACKINQAAAAAAACKINQAAAAAAACKINQAAAAAAACKYKNwACCblrbW3EuAqtnYHgAAID+dGgAAAAAAQBF0agAA2bjzHQAAAKiFTg0AAAAAAKAIQg0AAAAAAKAIQg0AAAAAAKAI9tQAALJpaWvNvQSomj1gAAAA8hNqAADZKBIDAAAAtRBqAADZ6NSgJEI4AACA/IQaAAAAQFZudKAkbnQAyEuoAQBk4w9CACDCNQEAUL0uuRcAAAAAAABQDZ0aAEA2Rk1QEncRAwAA5CfUAACyUSQGAAAAamH8FAAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAQbhQMA2bS0teZeAlTNxvYAAAD5CTUAgGwUiQEAAIBaGD8FAAAAAAAUQacGAJCN8VOURGcRAABAfjo1AAAAAACAIujUAACycec7ABChe5OyuIYFyEuoAQAAAGSlSAwAVMv4KQAAAAAAoAg6NQCAbIyaoCTuIgYAAMhPpwYAAAAAAFAEnRoAQDbufAcAAABqoVMDAAAAAAAogk4NACAbe2pQEp1FAAAA+Qk1AIBsFIkBAACAWhg/BQAAAAAAFEGnBgCQjfFTlERnEQAAQH5CDQAgG0ViAAAAoBbGTwEAAAAAAEXQqQEAZGP8FCXRWQQAAJCfTg0AAAAAAKAIOjUAgGzc+Q4AAADUQqcGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBKEGAAAAAABQBBuFAwDZtLS15l4CVM3G9gAAAPkJNQCAbBSJAQAAgFoINQCAbHRqUBIhHAAAQH5CDQAgG0ViAAAAoBY2CgcAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIog1AAAAAAAAIrQLfcCAIDOq6WtNfcSoGrNDU25lwAAANDp6dQAAAAAAACKoFMDAMjGne8AAABALXRqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAARRBqAAAAAAAAReiWewEAAABA59bS1pp7CVC15oam3EsA6NSEGgAAAEBWisQAQLWEGgBANu7KpCQKbgAAAPkJNQCAbBSJAQAAgFoINQCAbHRqUBIhHAAAQH51KaWUexGsHG+//Xacc845ccopp8Saa66ZezmwXI5XSuJ4pSSOV0rjmKUkjldK4nilJI5XSuJ4zU+osRr5xz/+EWuttVb83//9X/Tt2zf3cmC5HK+UxPFKSRyvlMYxS0kcr5TE8UpJHK+UxPGaX5fcCwAAAAAAAKiGUAMAAAAAACiCUAMAAAAAACiCUGM1suaaa8YZZ5xhgxqK4HilJI5XSuJ4pTSOWUrieKUkjldK4nilJI7X/GwUDgAAAAAAFEGnBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAShBgAAAAAAUAShxmrgvvvui7q6unj99ddzLwU69P3vfz+22mqrytujR4+O/fbbb7mfs8suu8Txxx+/StfFR1tdXV387ne/W+bHSzgPdvQ9wIpY2rH/u9/9LjbeeOPo2rWrcydZfViv37WeX4cMGRI/+clPVt
l6KNOcOXOirq4uJk+enHsp7ZRwjdOZrMjfMh35qP8bX3PNNbH22mvnXgariWpegzt6XV/8fP1R/x1i9VPNMbf4uXNVvH50dkKNVWBZB6YTLauDV155JY488sgYPHhwrLnmmjFw4MBobm6OiRMnrtDzXXzxxXHNNdes3EVSnBdffDGOPfbYGDp0aKy55poxaNCg2HvvveOee+7JvbSV5oUXXog999wz9zL4iFnZ59SIiG9+85vxpS99KZ5//vk466yzqvoc4fHqbXU5xy7+x+BCtZ5fJ02aFN/4xjcqbwudV73nn38+vv71r0dDQ0N07949Ghsb49vf/nbMmzcv99IqBg0aFC+88EJ8/OMfz70UPiR1dXXL/e/73//+Ep9z4oknFnfurIXQl4VW5PdjZan1dX377bePF154IdZaa62IEMR1NqNHj64cl927d4+NN944zjzzzHj33Xc7/NxVeawccMABMWPGjGV+fPFamL/Hatct9wKo3oIFC6J79+65l0En98UvfjEWLFgQv/zlL2Po0KHx0ksvxT333LPCf5QuvPCg85ozZ07ssMMOsfbaa8f5558fI0aMiHfeeSdaWlri6KOPjmnTpn0o61jV59iBAweusuemXCv7nDp//vx4+eWXo7m5ORoaGlbyainRR+UcuyrVen7t37//KloJS/Pss8/GdtttF5tssklMmDAhPvaxj8XTTz8dJ510Utxxxx3x8MMPR79+/Zb6uR/m3z9du3b9yL1Wv/POO7mXsFp74YUXKv9/ww03xOmnnx7Tp0+vvK++vr7y/ymleO+996K+vr7d+2F1VcvvRzVqOZ/Xei7u3r37R+78zYdrjz32iPHjx8fbb78dt99+exx99NGxxhprxCmnnJJtTT179oyePXsu8+NqYR+cTo1M5s2bFwceeGBsuOGG0atXrxgxYkRMmDCh3WN22WWXOOaYY+L444+P9dZbL5qbmyMi4vbbb49NNtkkevbsGbvuumvMmTOn3ectTBpbWlpi+PDhUV9fH3vssUe7F6WIiKuuuiqGDx8ePXr0iM022ywuu+yyyscWLFgQxxxzTGywwQbRo0ePaGxsjHPOOSci/n1B9/3vf79yV2lDQ0Mcd9xxq+CnxEfN66+/Hg888ED8+Mc/jl133TUaGxvjE5/4RJxyyimxzz77RETE3LlzY9999436+vro27dvfPnLX46XXnppmc+5eGfTm2++GYceemjU19fHBhtsEBdeeOESn3PZZZfFsGHDokePHjFgwID40pe+tNK/Vz48Rx11VNTV1cWjjz4aX/ziF2OTTTaJLbbYIr773e/Gww8/XHncq6++Gp///OejV69eMWzYsLj11luX+7w33XRTbLHFFrHmmmvGkCFDljiWhgwZEmeddVYceuih0bdv38pduyeffHJssskm0atXrxg6dGicdtpp7YoKC+8Uvvrqq2Pw4MFRX18fRx11VLz33ntx3nnnxcCBA2P99dePH/3oR+2+3qJ3Ai9smb755ptj1113jV69ekVTU1M89NBD7T7nz3/+c+y4447Rs2fPGDRoUBx33HHx5ptvVj7ud6FsHZ1TlzYK5fXXX4+6urq47777lni+++67L/r06RMREbvttlvlcR1dc4wePTr+9Kc/xcUXX1y5y2nhtcWUKVNizz33jPr6+hgwYEAccsgh8eqrr67KHwsrWTXn2I5euxee96699toYMmRIrLXWWvGVr3wl3njjjcpjqnn9XlpHxNprr93uLrW//e1vceCBB0a/fv2id+/ese2228YjjzwS11xzTfzgBz+I1tbWynG68PMWfd7tt98+Tj755HZf45VXXok11lgj7r///ohofyfykCFDIiLi85//fNTV1cWQIUNizpw50aVLl3jsscfaPc9PfvKTaGxsjPfff7+qnz3/dvTRR0f37t3jzjvvjJ133jkGDx4ce+65Z9x9993x97//Pb73ve9VHrus1+Yrr7wyBg0aFL169YrPf/7zMX
bs2HZ3Vs6aNSv23XffGDBgQNTX18d//ud/xt13391uHUOGDImzzz47vv71r0efPn1i8ODB8fOf/7zy8aWdc59++un43Oc+F3379o0+ffrEjjvuGLNmzVrm97q8x0+aNClGjRoV6623Xqy11lqx8847x+OPP97u8+vq6uLyyy+PffbZJ3r37t3uWmLixImx5ZZbRo8ePeJTn/pUTJkypd3nVnPds7zvvzMaOHBg5b+11lor6urqKm9PmzYt+vTpE3fccUdss802seaaa8af//znZXaMXXDBBbHBBhvEuuuuG0cffXS7a8drr702tt122+jTp08MHDgwDjrooHj55ZeXu7Zq/j1/+MMfVs67jY2Nceutt8Yrr7xSOZ9vueWWS5zHlndtucsuu8Rzzz0X3/nOdyrn2UUtr75Q7fF91VVXLfd6vqPrjt/+9rcxYsSI6NmzZ6y77rqx++67t7s2ZuVZ3u/HFVdcEZ/+9KfbPf4nP/lJ5TU14v//O/9HP/pRNDQ0xKabblr52BtvvBEHHnhg9O7dOzbccMMYN25cu+da/Hrh0Ucfja233jp69OgR2267bTzxxBPtHr/oVJT77rsvvva1r8X//d//tesqOfPMM5faibfVVlvFaaed9gF+UnwULOx4b2xsjCOPPDJ23333uPXWW+O1116LQw89NNZZZ53o1atX7LnnnjFz5syIiGUeKxHVn7eX99rcURfIorWwpf09Nnv27Nh4443jggsuaPd5kydPjrq6unjmmWc+2A9tdZBY6Q477LC07777LvH+e++9N0VEeu2119Lf/va3dP7556cnnngizZo1K11yySWpa9eu6ZFHHqk8fuedd0719fXppJNOStOmTUvTpk1Lc+fOTWuuuWb67ne/m6ZNm5Z+9atfpQEDBlSeN6WUxo8fn9ZYY420++67p0mTJqW//OUvafjw4emggw6qPPevfvWrtMEGG6SbbropPfvss+mmm25K/fr1S9dcc01KKaXzzz8/DRo0KN1///1pzpw56YEHHkjXXXddSimlG2+8MfXt2zfdfvvt6bnnnkuPPPJI+vnPf77qfqB8ZLzzzjupvr4+HX/88elf//rXEh9/77330lZbbZU+/elPp8ceeyw9/PDDaZtttkk777xz5TFnnHFGampqqry9+O/LkUcemQYPHpzuvvvu9OSTT6bPfe5zqU+fPunb3/52SimlSZMmpa5du6brrrsuzZkzJz3++OPp4osvXkXfMavavHnzUl1dXTr77LOX+7iISP/xH/+RrrvuujRz5sx03HHHpfr6+jRv3ryUUvvza0opPfbYY6lLly7pzDPPTNOnT0/jx49PPXv2TOPHj688Z2NjY+rbt2+64IIL0jPPPJOeeeaZlFJKZ511Vpo4cWKaPXt2uvXWW9OAAQPSj3/848rnnXHGGam+vj596UtfSk8//XS69dZbU/fu3VNzc3M69thj07Rp09LVV1+dIiI9/PDD7b6HW265JaWU0uzZs1NEpM022yz9/ve/T9OnT09f+tKXUmNjY3rnnXdSSik988wzqXfv3umiiy5KM2bMSBMnTkxbb711Gj16dErJ78LqoKNz6sLj5Iknnqi877XXXksRke69996UUvtj/+23307Tp09PEZFuuumm9MILL6S33367w2uO119/PW233XZpzJgx6YUXXkgvvPBCevfdd9Nrr72W+vfvn0455ZQ0derU9Pjjj6dRo0alXXfd9cP48bASVHOOrfa1u76+Pn3hC19ITz31VLr//vvTwIED0//7f/+v8piOXr9Tan8eXGittdaqnJvfeOONNHTo0LTjjjumBx54IM2cOTPdcMMN6cEHH0z//Oc/0wknnJC22GKLynH6z3/+c4nnvfTSS9PgwYPT+++/X/kaP/3pT9u9r7GxMV100UUppZRefvnlFBFp/Pjx6YUXXkgvv/xySimlUaNGpaOOOqrdWrfccst0+umnd/hz5//X0TE4ZsyYtM4667T7t1
n8tfnPf/5z6tKlSzr//PPT9OnT07hx41K/fv3SWmutVXmeyZMnpyuuuCI99dRTacaMGenUU09NPXr0SM8991zlMY2Njalfv35p3LhxaebMmemcc85JXbp0SdOmTUspLXnO/dvf/pb69euXvvCFL6RJkyal6dOnp6uvvrry+MV19Ph77rknXXvttWnq1Knpr3/9azr88MPTgAED0j/+8Y/Kc0REWn/99dPVV1+dZs2alZ577rnKeX748OHpzjvvrPx+DRkyJC1YsCClVP11z/K+/85u/Pjx7Y6phT/3LbfcMt15553pmWeeSfPmzVvq3zJ9+/ZN3/rWt9LUqVPTbbfdlnr16tXu7+Nf/OIX6fbbb0+zZs1KDz30UNpuu+3SnnvuucTXqvU6tl+/fumKK65IM2bMSEceeWTq27dv2mOPPdJvfvObNH369LTffvul4cOHV36/Orq2nDdvXvqP//iPdOaZZ1bOswt/Nh3VF6o9vpd3Pd/RdUdbW1vq1q1bGjt2bJo9e3Z68skn07hx49Ibb7zxAf7lqcbivx+L/x6klNJFF12UGhsbK28fdthhqb6+Ph1yyCFpypQpacqUKSmlfx+7ffr0Seecc06aPn165br0zjvvrHzuoq/rb7zxRurfv3866KCD0pQpU9Jtt92Whg4d2u58vfj18E9+8pPUt2/fynH8xhtvpOeffz516dIlPfroo5Wv8/jjj6e6uro0a9aslfrz4sO1tBrsPvvsk0aOHJn22WefNHz48HT//fenyZMnp+bm5rTxxhunBQsWLPNYSan68/byXps7+r1ZdN3L+nvsRz/6Udp8883bfW/HHXdc2mmnnVbeD7BgQo1V4LDDDktdu3ZNvXv3bvdfjx492l2sLO6zn/1sOuGEEypv77zzzmnrrbdu95hTTjlliQP65JNPXiLUiIhKgS6llMaNG5cGDBhQeXujjTaqhBQLnXXWWWm77bZLKaV07LHHpt12263dH4QLXXjhhWmTTTap/KLSufz2t79N66yzTurRo0fafvvt0ymnnJJaW1tTSindeeedqWvXrmnu3LmVxz/99NMpIioXD8s7kb/xxhupe/fu6Te/+U3l4/PmzUs9e/asFEVuuumm1Ldv33YXyJTrkUceSRGRbr755uU+LiLSqaeeWnl7/vz5KSLSHXfckVJa8o/Bgw46KI0aNardc5x00kntzp+NjY1pv/3263CN559/ftpmm20qb59xxhmpV69e7Y7B5ubmNGTIkPTee+9V3rfpppumc845p933sHiocdVVV1U+vvB3ZerUqSmllA4//PD0jW98o91aHnjggdSlS5f01ltv+V1YTSzvnFprqLG0jy/L0q45Fi0+p/Tv64LPfOYz7d73/PPPp4hI06dPX6Hvlw9XNefYal+7Fz/vnXTSSemTn/xkSqm61++UOg41fvazn6U+ffpUClyLW1oRZfHnffnll1O3bt3S/fffX/n4dtttl04++eTK24uGGsta1w033JDWWWedSuD4l7/8JdXV1aXZs2cvdW0s3cMPP7zUn+9CY8eOTRGRXnrppZTS0l+bDzjggPTZz3623fsOPvjgdoWCpdliiy3ST3/608rbjY2N6atf/Wrl7ffffz+tv/766fLLL08pLXnOPeWUU9LHPvaxqv/mqfXx7733XurTp0+67bbbKu+LiHT88ce3e9zC8/z1119fed/C368bbrghpVT9dc/yvv/Oblmhxu9+97t2j1va3zKNjY3p3Xffrbxv//33TwcccMAyv9akSZNSRFSKZyt6Hbvov+cLL7yQIiKddtpplfc99NBDKSIq4URH15YLn3fR8+PCn01H9YXFLev4Xt71fEfXHX/5y19SRKQ5c+Ys8+uyaqxoqDFgwID09ttvt3tcY2Nj2mOPPdq974ADDmhXMF70deNnP/tZWnfddSvHaEopXX755csMNZa23oX23HPPdOSRR1bePvbYY9Muu+zSwXfPR92iNaX3338/3XXXXW
nNNddM++23X4qINHHixMpjX3311dSzZ8/KNeuyjpXFLeu8vbzX5lpCjZSW/vfY3//+93Y3oy1YsCCtt956lRvSOzvjp1aRXXfdNSZPntzuv6uuuqry8ffeey/OOuusGDFiRPTr1y/q6+ujpaUl5s6d2+55ttlmm3ZvT506NT75yU+2e9922223xNfv1atXbLTRRpW3N9hgg0qr1JtvvhmzZs2Kww8/vDITtL6+Pn74wx9W2qNHjx4dkydPjk033TSOO+64uPPOOyvPtf/++8dbb70VQ4cOjTFjxsQtt9xS1QY8rB6++MUvRltbW9x6662xxx57xH333RcjR46Ma665JqZOnRqDBg2KQYMGVR6/+eabx9prrx1Tp07t8LlnzZoVCxYsaHeM9+vXr12r6qhRo6KxsTGGDh0ahxxySPz617+Of/7znyv3m+RDk1Kq+rFbbrll5f979+4dffv2XWbr/tSpU2OHHXZo974ddtghZs6cGe+9917lfdtuu+0Sn3vDDTfEDjvsEAMHDoz6+vo49dRTlzg3DxkypDLmJyJiwIABsfnmm0eXLl3ava+j0QKLfk8bbLBBRETlc1pbW+Oaa65pd55ubm6O999/P2bPnu13YTWxvHPqylLtNcfiWltb49577213DG622WYREcsdv8JHRzXn2Gpfuxc/7y16bVnN63c1Jk+eHFtvvfUy91eoRv/+/eMzn/lM/PrXv46IiNmzZ8dDDz0UBx98cE3Ps99++0XXrl3jlltuiYh/jxDYdddd243WoHq1vN4v/to8ffr0+MQnPtHufYu/PX/+/DjxxBNj+PDhsfbaa0d9fX1MnTp1ifPcoq+7C0epLOu1evLkybHjjjvGGmusUdW6O3r8Sy+9FGPGjIlhw4bFWmutFX379o358+cvscalXZtEtP+bb+Hv18Lf0Wqve2r5/vm3Zf17LGqLLbaIrl27Vt5e9PwYEfGXv/wl9t577xg8eHD06dMndt5554iIZb4Or8i/54ABAyIiYsSIEUu8r9pry+VZXn0hovrje3nX8x1ddzQ1NcV//dd/xYgRI2L//fePK6+8Ml577bXlrpu8RowYsdR9NBavYW233XbLrBdMnTq1Mt5nWZ9frTFjxsSECRPiX//6VyxYsCCuu+66+PrXv75Cz8VHy+9///uor6+PHj16xJ577hkHHHBAjB49Orp169bu+nTddddt9/q5LNWet5f32rwyNDQ0xGc/+9m4+uqrIyLitttui7fffjv233//lfY1SibUWEV69+4dG2+8cbv/Ntxww8rHzz///Lj44ovj5JNPjnvvvTcmT54czc3NsWDBgiWeZ0UsfjFdV1dX+WNi/vz5EfHv2bSLhi5TpkypzFYeOXJkzJ49O84666x466234stf/nJlVvugQYNi+vTpcdlll0XPnj3jqKOOip122slGdp1Ijx49YtSoUXHaaafFgw8+GKNHj44zzjjjQ/naffr0iccffzwmTJgQG2ywQZx++unR1NQUr7/++ofy9Vm5hg0bFnV1dVVtVLu089oHnWu++Dl2YeFrr732it///vfxxBNPxPe+970lzs1LW8uKrG/Rz1k4t3jh58yfPz+++c1vtjtPt7a2xsyZM2OjjTbyu7AaWdY5dWFItmgxcEVea6u95ljc/PnzY++9917iJo2ZM2fGTjvtVPM6+PDVco7tyMo4By96PbrQosf08jZTrMXBBx8cv/3tb+Odd96J6667LkaMGNGu0FeN7t27x6GHHhrjx49X+PgANt5446irq1tusWqdddZpt3n7ivz9c+KJJ8Ytt9wSZ599djzwwAMxefLkGDFiRFWv38s6jms9Hjt6/GGHHRaTJ0+Oiy++OB588MGYPHlyrLvuuivt779qrIprqdVdNf8ey/u5vvnmm9Hc3Bx9+/aNX//61zFp0qRKWNrR63AtX3fhdeQHubas9Xtc9Hxe7fG9vJ9VR9cdXbt2jbvuuivuuOOO2HzzzeOnP/1pbL
rpph0GMqx8Xbp0We7r+UKr8ny2Ivbee+9Yc80145Zbbonbbrst3nnnHXsSriYW3lg+c+bMeOutt+KXv/zlEvsCVWtVnrdXxBFHHBHXX399vPXWWzF+/Pg44IADolevXh/6Oj6KhBqZTJw4Mfbdd9/46le/Gk1NTTF06NCYMWNGh583fPjwePTRR9u9b9GNdKsxYMCAaGhoiGeffXaJ4OVjH/tY5XF9+/aNAw44IK688sq44YYb4qabbor//d//jYh/X7Tvvffecckll8R9990XDz30UDz11FM1rYPVx+abbx5vvvlmDB8+PJ5//vl4/vnnKx/761//Gq+//npsvvnmHT7PRhttFGussUY88sgjlfe99tprS/xudOvWLXbfffc477zz4sknn4w5c+bEH//4x5X3DfGh6devXzQ3N8e4ceOWusnfihbohw8fHhMnTmz3vokTJ8Ymm2zS7k66xT344IPR2NgY3/ve92LbbbeNYcOGxXPPPbdCa/igRo4cGX/961+XOE9vvPHGlTue/C6snhaeUxcW+RbdiHPRDWyrVc01R/fu3dvd/Rnx72Pw6aefjiFDhixxDH7U/khl6ao5x37Q1+6I6l+/+/fv3+54njlzZrsOsy233DImT55cud5c3NKO06XZd99941//+lf84Q9/iOuuu67DLo011lhjqc97xBFHxN133x2XXXZZvPvuu/GFL3yhw69Ne+uuu26MGjUqLrvssnjrrbfafezFF1+MX//613HAAQcst/Cw6aabxqRJk9q9b/G3J06cGKNHj47Pf/7zMWLEiBg4cGDMmTPnA619yy23jAceeKDqMLmjx0+cODGOO+642GuvvSobQC+6AXJHFv2bb+Hv1/DhwyNixa97WPWmTZsW8+bNi3PPPTd23HHH2GyzzTrsjllV/57VXFtWe55d3Ac9vheur6Prjrq6uthhhx3iBz/4QTzxxBPRvXv3SrGRD0///v3jxRdfbBds1HKNungN6+GHH66czxY3fPjwePLJJ+Nf//rXMj9/ccs6jrt16xaHHXZYjB8/PsaPHx9f+cpXVtoNFeS18MbywYMHR7du3SLi38fOu+++2+76dN68eTF9+vTKNe7SjpVaztvLe22u1bKO27322it69+4dl19+efzhD39wk80ihBqZDBs2LO6666548MEHY+rUqfHNb34zXnrppQ4/71vf+lbMnDkzTjrppJg+fXpcd911KzSi4gc/+EGcc845cckll8SMGTPiqaeeivHjx8fYsWMjImLs2LExYcKEmDZtWsyYMSNuvPHGGDhwYKy99tpxzTXXxC9+8YuYMmVKPPvss/GrX/0qevbsGY2NjTWvg7LMmzcvdtttt/jVr34VTz75ZMyePTtuvPHGOO+882LfffeN3XffPUaMGBEHH3xwPP744/Hoo4/GoYceGjvvvHNVrdv19fVx+OGHx0knnRR//OMfY8qUKTF69Oh2I31+//vfxyWXXBKTJ0+O5557Lv77v/873n///ZpHXPDRMW7cuHjvvffiE5/4RNx0000xc+bMmDp1alxyySUr3Fp8wgknxD333BNnnXVWzJgxI375y1/GpZdeGieeeOJyP2/YsGExd+7cuP7662PWrFlxySWXZPtD6eSTT44HH3wwjjnmmMpdJ//zP/8TxxxzTET4XVgddHRO7dmzZ3zqU5+Kc889N6ZOnRp/+tOf4tRTT63561RzzTFkyJB45JFHYs6cOfHqq6/G+++/H0cffXT87//+bxx44IExadKkmDVrVrS0tMTXvva1FSp4kEdH59gP+todUd3rd0TEbrvtFpdeemk88cQT8dhjj8W3vvWtdnftHnjggTFw4MDYb7/9YuLEifHss8/GTTfdFA899FBE/Ps4nT17dkyePDleffXVePvtt5e6nt69e8d+++0Xp512WkydOjUOPPDA5a5/yJAhcc8998SLL77YbpTJ8OHD41Of+lScfPLJceCBByp8rKBLL7003n777W
hubo77778/nn/++fjDH/4Qo0aNig033DB+9KMfLffzjz322Lj99ttj7NixMXPmzPjZz34Wd9xxR7sgZNiwYXHzzTdX7jw/6KCDPnAHwjHHHBP/+Mc/4itf+Uo89thjMXPmzLj22mtj+vTpK/T4YcOGxbXXXhtTp06NRx55JA4++OCajqkzzzwz7rnnnsrv13rrrRf77bdfRKz4dQ+r3uDBg6N79+7x05/+NJ599tm49dZb46yzzlru56yqf8+Ori0j/n0+vP/+++Pvf/97TaHEBz2+I6LD645HHnkkzj777Hjsscdi7ty5cfPNN8crr7yywgVEVtwuu+wSr7zySpx33nkxa9asGDduXNxxxx1Vf/7EiRPjvPPOixkzZsS4cePixhtvjG9/+9tLfexBBx0UdXV1MWbMmPjrX/8at99+e1xwwQXLff4hQ4bE/Pnz45577olXX3213Q0URxxxRPzxj39UHO4Ehg0bFvvuu2+MGTMm/vznP0dra2t89atfjQ033DD23XffiFj6sVLLeXt5r821WtrfYxERXbt2jdGjR8cpp5wSw4YNW+EayepIqJHJqaeeGiNHjozm5ubYZZddKn/AdWTw4MFx0003xe9+97toamqKK664Is4+++yav/4RRxwRV111VYwfPz5GjBgRO++8c1xzzTWVTo0+ffrEeeedF9tuu23853/+Z8yZMyduv/326NKlS6y99tpx5ZVXxg477BBbbrll3H333XHbbbfFuuuuW/M6KEt9fX188pOfjIsuuih22mmn+PjHPx6nnXZajBkzJi699NKoq6uL//mf/4l11lkndtppp9h9991j6NChccMNN1T9Nc4///zYcccdY++9947dd989Pv3pT7fbW2bttdeOm2++OXbbbbcYPnx4XHHFFTFhwoTYYostVsW3zIdg6NCh8fjjj8euu+4aJ5xwQnz84x+PUaNGxT333BOXX375Cj3nyJEj4ze/+U1cf/318fGPfzxOP/30OPPMM2P06NHL/bx99tknvvOd78QxxxwTW221VTz44INx2mmnrdAaPqgtt9wy/vSnP8WMGTNixx13jK233jpOP/30aGhoiAi/C6uDjs6pERFXX311vPvuu7HNNtvE8ccfHz/84Q9r/jrVXHOceOKJ0bVr19h8882jf//+MXfu3GhoaIiJEyfGe++9F5/5zGdixIgRcfzxx8faa6+9RLGaj66OzrEr47U7ouPX74iICy+8MAYNGhQ77rhjHHTQQXHiiSe2a5/v3r173HnnnbH++uvHXnvtFSNGjIhzzz23cmfyF7/4xdhjjz1i1113jf79+8eECROWuZ6DDz44WltbY8cdd4zBgwcvd+0XXnhh3HXXXTFo0KDYeuut233s8MMPjwULFih8fADDhg2Lxx57LIYOHRpf/vKXY6ONNopvfOMbseuuu8ZDDz3U4R4qO+ywQ1xxxRUxduzYaGpqij/84Q/xne98p9189bFjx8Y666wT22+/fey9997R3NwcI0eO/EDrXnfddeOPf/xjzJ8/P3beeefYZptt4sorr1zmnhkdPf4Xv/hFvPbaazFy5Mg45JBD4rjjjov111+/6vWce+658e1vfzu22WabePHFF+O2226r3F2/otc9rHr9+/ePa665Jm688cbYfPPN49xzz+2wILuq/j07uraM+HeBbs6cObHRRhu1GwvXkQ96fEdEh9cdffv2jfvvvz/22muv2GSTTeLUU0+NCy+8MPbcc8+avg4f3PDhw+Oyyy6LcePGRVNTUzz66KM1hW4nnHBCPPbYY7H11lvHD3/4wxg7dmw0Nzcv9bH19fVx2223xVNPPRVbb711fO9734sf//jHy33+7bffPr71rW/FAQccEP3794/zzjuv8rFhw4bF9ttvH5ttttkS+9Wy+hk/fnxss8028bnPfS622267SCnF7bffXnltXtqxUst5e3mvzbVa2t9jCy28Hv3a1762Qs+9uqpLtezaBgAA0Emcdd
ZZceONN8aTTz6ZeyksYsyYMTFt2rR44IEHci8FgIKklGLYsGFx1FFHxXe/+93cy4GqPPDAA/Ff//Vf8fzzz8eAAQNyL+cjo1vuBQAAAHyUzJ8/P+bMmROXXnrpCnVIsXJdcMEFMWrUqOjdu3fccccd8ctf/jIuu+yy3MsCoCCvvPJKXH/99fHiiy+6450ivP322/HKK6/E97///dh///0FGosRagAAACzimGOOiQkTJsR+++1n9NRHwKOPPhrnnXdevPHGGzF06NC45JJL4ogjjsi9LAAKsv7668d6660XP//5z2OdddbJvRzo0IQJE+Lwww+PrbbaKv77v/8793I+coyfAgAAAAAAimCXRwAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAhCDQAAAAAAoAj/H8Jnd8M2j1siAAAAAElFTkSuQmCC",
|
||
"text/plain": [
|
||
"<Figure size 2000x1200 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"import matplotlib.pyplot as plt # Загружаем модуль matplotlib.pyplot\n",
|
||
"import seaborn as sns # Загружаем модуль seaborn\n",
|
||
"%matplotlib inline\n",
|
||
"\n",
|
||
"fig, ax = plt.subplots(figsize=(20,12)) # Создаем область под график\n",
|
||
"sns_heatmap = sns.heatmap(data.isnull(), yticklabels=False, cbar=False, cmap='viridis') # Визуализируем прпуски\n",
|
||
"plt.show() # Отображаем график"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 495
|
||
},
|
||
"id": "uZgh1E3nhkx6",
|
||
"outputId": "3a6709c7-3fbc-4260-bcd9-69c47228b013"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/html": [
|
||
"<div>\n",
|
||
"<style scoped>\n",
|
||
" .dataframe tbody tr th:only-of-type {\n",
|
||
" vertical-align: middle;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe tbody tr th {\n",
|
||
" vertical-align: top;\n",
|
||
" }\n",
|
||
"\n",
|
||
" .dataframe thead th {\n",
|
||
" text-align: right;\n",
|
||
" }\n",
|
||
"</style>\n",
|
||
"<table border=\"1\" class=\"dataframe\">\n",
|
||
" <thead>\n",
|
||
" <tr style=\"text-align: right;\">\n",
|
||
" <th></th>\n",
|
||
" <th>Hardness</th>\n",
|
||
" <th>Solids</th>\n",
|
||
" <th>Chloramines</th>\n",
|
||
" <th>Sulfate</th>\n",
|
||
" <th>Conductivity</th>\n",
|
||
" <th>Organic carbon</th>\n",
|
||
" <th>Trihalomethanes</th>\n",
|
||
" <th>Turbidity</th>\n",
|
||
" <th>Potability</th>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>ph</th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" <th></th>\n",
|
||
" </tr>\n",
|
||
" </thead>\n",
|
||
" <tbody>\n",
|
||
" <tr>\n",
|
||
" <th>NaN</th>\n",
|
||
" <td>204.890455</td>\n",
|
||
" <td>20791.318981</td>\n",
|
||
" <td>7.300212</td>\n",
|
||
" <td>368.516441</td>\n",
|
||
" <td>564.308654</td>\n",
|
||
" <td>10.379783</td>\n",
|
||
" <td>86.99097</td>\n",
|
||
" <td>2.963135</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>3.716080</th>\n",
|
||
" <td>129.422921</td>\n",
|
||
" <td>18630.057858</td>\n",
|
||
" <td>6.635246</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>592.885359</td>\n",
|
||
" <td>15.180013</td>\n",
|
||
" <td>56.329076</td>\n",
|
||
" <td>4.500656</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.099124</th>\n",
|
||
" <td>224.236259</td>\n",
|
||
" <td>19909.541732</td>\n",
|
||
" <td>9.275884</td>\n",
|
||
" <td>Python</td>\n",
|
||
" <td>418.606213</td>\n",
|
||
" <td>16.868637</td>\n",
|
||
" <td>66.420093</td>\n",
|
||
" <td>3.055934</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.316766</th>\n",
|
||
" <td>214.373394</td>\n",
|
||
" <td>22018.417441</td>\n",
|
||
" <td>8.059332</td>\n",
|
||
" <td>356.886136</td>\n",
|
||
" <td>363.266516</td>\n",
|
||
" <td>18.436524</td>\n",
|
||
" <td>100.341674</td>\n",
|
||
" <td>4.628771</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>9.092223</th>\n",
|
||
" <td>181.101509</td>\n",
|
||
" <td>17978.986339</td>\n",
|
||
" <td>6.546600</td>\n",
|
||
" <td>310.135738</td>\n",
|
||
" <td>398.410813</td>\n",
|
||
" <td>11.558279</td>\n",
|
||
" <td>31.997993</td>\n",
|
||
" <td>4.075075</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>5.584087</th>\n",
|
||
" <td>188.313324</td>\n",
|
||
" <td>28748.687739</td>\n",
|
||
" <td>7.544869</td>\n",
|
||
" <td>326.678363</td>\n",
|
||
" <td>280.467916</td>\n",
|
||
" <td>8.399735</td>\n",
|
||
" <td>54.917862</td>\n",
|
||
" <td>2.559708</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>10.223862</th>\n",
|
||
" <td>248.071735</td>\n",
|
||
" <td>28749.716544</td>\n",
|
||
" <td>7.513408</td>\n",
|
||
" <td>393.663396</td>\n",
|
||
" <td>283.651634</td>\n",
|
||
" <td>13.789695</td>\n",
|
||
" <td>84.603556</td>\n",
|
||
" <td>2.672989</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>8.635849</th>\n",
|
||
" <td>203.361523</td>\n",
|
||
" <td>13672.091764</td>\n",
|
||
" <td>4.563009</td>\n",
|
||
" <td>303.309771</td>\n",
|
||
" <td>474.607645</td>\n",
|
||
" <td>12.363817</td>\n",
|
||
" <td>62.798309</td>\n",
|
||
" <td>4.401425</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>NaN</th>\n",
|
||
" <td>118.988579</td>\n",
|
||
" <td>14285.583854</td>\n",
|
||
" <td>7.804174</td>\n",
|
||
" <td>268.646941</td>\n",
|
||
" <td>389.375566</td>\n",
|
||
" <td>12.706049</td>\n",
|
||
" <td>53.928846</td>\n",
|
||
" <td>3.595017</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" <tr>\n",
|
||
" <th>11.180284</th>\n",
|
||
" <td>227.231469</td>\n",
|
||
" <td>25484.508491</td>\n",
|
||
" <td>9.077200</td>\n",
|
||
" <td>404.041635</td>\n",
|
||
" <td>563.885481</td>\n",
|
||
" <td>17.927806</td>\n",
|
||
" <td>71.976601</td>\n",
|
||
" <td>4.370562</td>\n",
|
||
" <td>0</td>\n",
|
||
" </tr>\n",
|
||
" </tbody>\n",
|
||
"</table>\n",
|
||
"</div>"
|
||
],
|
||
"text/plain": [
|
||
" Hardness Solids Chloramines Sulfate Conductivity \\\n",
|
||
"ph \n",
|
||
"NaN 204.890455 20791.318981 7.300212 368.516441 564.308654 \n",
|
||
"3.716080 129.422921 18630.057858 6.635246 Python 592.885359 \n",
|
||
"8.099124 224.236259 19909.541732 9.275884 Python 418.606213 \n",
|
||
"8.316766 214.373394 22018.417441 8.059332 356.886136 363.266516 \n",
|
||
"9.092223 181.101509 17978.986339 6.546600 310.135738 398.410813 \n",
|
||
"5.584087 188.313324 28748.687739 7.544869 326.678363 280.467916 \n",
|
||
"10.223862 248.071735 28749.716544 7.513408 393.663396 283.651634 \n",
|
||
"8.635849 203.361523 13672.091764 4.563009 303.309771 474.607645 \n",
|
||
"NaN 118.988579 14285.583854 7.804174 268.646941 389.375566 \n",
|
||
"11.180284 227.231469 25484.508491 9.077200 404.041635 563.885481 \n",
|
||
"\n",
|
||
" Organic carbon Trihalomethanes Turbidity Potability \n",
|
||
"ph \n",
|
||
"NaN 10.379783 86.99097 2.963135 0 \n",
|
||
"3.716080 15.180013 56.329076 4.500656 0 \n",
|
||
"8.099124 16.868637 66.420093 3.055934 0 \n",
|
||
"8.316766 18.436524 100.341674 4.628771 0 \n",
|
||
"9.092223 11.558279 31.997993 4.075075 0 \n",
|
||
"5.584087 8.399735 54.917862 2.559708 0 \n",
|
||
"10.223862 13.789695 84.603556 2.672989 0 \n",
|
||
"8.635849 12.363817 62.798309 4.401425 0 \n",
|
||
"NaN 12.706049 53.928846 3.595017 0 \n",
|
||
"11.180284 17.927806 71.976601 4.370562 0 "
|
||
]
|
||
},
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"data.fillna(\"Python\").head(10) # С помощью метода .fillna() заменяем все пропуски словом Python"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"metadata": {
|
||
"id": "mBsHwML6hkyF"
|
||
},
|
||
"outputs": [
|
||
{
|
||
"ename": "SyntaxError",
|
||
"evalue": "invalid syntax (192570114.py, line 3)",
|
||
"output_type": "error",
|
||
"traceback": [
|
||
"\u001b[0;36m Cell \u001b[0;32mIn[28], line 3\u001b[0;36m\u001b[0m\n\u001b[0;31m Теперь посмотрим, а что содержательно у нас есть на руках.\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"### Описательные статистики\n",
|
||
"\n",
|
||
"Теперь посмотрим, а что содержательно у нас есть на руках. \n",
|
||
"\n",
|
||
"Глазами просматривать не будем, а попросим посчитать основные описательные статистики. Причем сразу все.\n",
|
||
"\n",
|
||
"- describe() - метод, который возвращает табличку с описательными статистиками. В таком виде считает все для числовых столбцов"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 297
|
||
},
|
||
"id": "ZWz60or1hkyG",
|
||
"outputId": "134781c0-28b6-4137-85d1-8376216860c6",
|
||
"scrolled": true
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.describe() # Отобразим описательные статистики нашего датафрейма (только числовые данные)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "0aeamrWMhkyK"
|
||
},
|
||
"source": [
|
||
"Немножко магии, и для нечисловых данные тоже будут свои описательные статистики. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 173
|
||
},
|
||
"id": "jKTF-2BHhkyK",
|
||
"outputId": "244a91a6-e8b4-42a9-d464-3728bc5c3dc5"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.describe(include=['O']) # # Отобразим описательные статистики нашего датафрейма ('O' - в том числе и строковые)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"id": "-IbwBRL_hkyO"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"### Срезы данных\n",
|
||
"\n",
|
||
"Допустим, нам не нужен датасет, а только определенные столбцы или строки или столбцы и строки. \n",
|
||
"\n",
|
||
"\n",
|
||
"Как делать?\n",
|
||
"Помним, что:\n",
|
||
"- у столбцов есть названия\n",
|
||
"- у строк есть названия\n",
|
||
"- если нет названий, то они пронумерованы с нуля\n",
|
||
"\n",
|
||
"Основываясь на этой идее, мы начнем отбирать данные."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 80
|
||
},
|
||
"id": "4uT9dn4vhkyO",
|
||
"outputId": "f57acc81-2f88-41fd-ada0-028d80ed3ca7"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.head(1) # Отобразим первую строчку датафрейма"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "5b2HPHJwhkyT"
|
||
},
|
||
"source": [
|
||
"#### Отбираем по столбцам. Версия 1. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 221
|
||
},
|
||
"id": "ocn9YgmnhkyZ",
|
||
"outputId": "438dc10b-ff81-42d3-deea-eefa140305d5"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"array = data['price'] # Отобразим столбец price\n",
|
||
"array"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 221
|
||
},
|
||
"id": "tBeyZQLPIMIJ",
|
||
"outputId": "95d6ac17-ddde-46d9-e9a8-6e265eb12085"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.price"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 119
|
||
},
|
||
"id": "YVzV30CQhkyV",
|
||
"outputId": "26839d1c-a250-4ec0-a388-50f50e45af89"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.price.head() # Отобразим столбец price (альтернативные вариант)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "IDhUYDK5hkye",
|
||
"outputId": "536a6bdf-0016-4faf-e984-f1bfa8d356ad"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"new_df = data[['price','country']].head() # Отобразим столбцы 'price' и 'country'\n",
|
||
"new_df"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "Pw7bsVKPhkyg"
|
||
},
|
||
"source": [
|
||
"#### Отбираем по строкам. Версия 1. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 359
|
||
},
|
||
"id": "-3ntG2CzhDyV",
|
||
"outputId": "799530a2-5339-4ddf-8cbc-ef187d8a148f"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data[10:20] # Отобразим с 10й по 20ю строки датафрейма"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 173
|
||
},
|
||
"id": "DaW5dRU7hkyh",
|
||
"outputId": "9665e0f7-f195-4217-b8a1-674700cdc917"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data[10:20:3] # Отобразим с 10й по 20ю строки датафрейма с шагом 2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 359
|
||
},
|
||
"id": "zXqL-lBEhGkG",
|
||
"outputId": "c741c699-f40d-4417-9bb4-bc849da4f19b"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data[::5].head(10) # Отобразим каждую 5ю строку датафрейма"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "pV0kczWfhkyk"
|
||
},
|
||
"source": [
|
||
"#### Отбор по столбцам. Версия 2. Все еще по названиям "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 495
|
||
},
|
||
"id": "blyn4oRnJOlm",
|
||
"outputId": "cc0258b0-2735-4d7f-cb92-ac9b23b2a83e"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.head(10)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 173
|
||
},
|
||
"id": "LfRYRSsohkyk",
|
||
"outputId": "c7f8b402-bba9-4da4-eebe-6402c24030c2"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.loc[4:7, ['price', 'points']] # Отобразим два столбца 'price' и 'points', и в них строки с индексами с 4 по 7"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "CK4-ntzDhkyo"
|
||
},
|
||
"source": [
|
||
"#### Отбор по строкам. Версия 2. Все еще по названиям "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 235
|
||
},
|
||
"id": "eqAQs0YIhkyq",
|
||
"outputId": "d0bab15c-91be-41a2-ef6f-82f2faf5e702"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.loc[:5,:] # Отобразим строки с индексом от 0 до 5 (то же, что и data.loc[:5])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "NGugpgJfhkyv"
|
||
},
|
||
"source": [
|
||
"#### Отбор по строчкам и столбцам. Версия 3. По номеру строк и столбцов"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "CG0aSTW8hkyv",
|
||
"outputId": "af2503a1-524e-431c-f61e-5d0a5590a9b1"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.iloc[::5, [1,3]].head() # Отобразим каждую 5 строку и 1 и 3 столбец"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "sIao6149hkyy"
|
||
},
|
||
"source": [
|
||
"#### Отбор с условиями\n",
|
||
"\n",
|
||
"Так, а если мне нужны вина дороже $15 долларов? Как быть?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"id": "YsIrLdRnhkyy"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"#задаем маску\n",
|
||
"mask = data['price'] > 15"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 119
|
||
},
|
||
"id": "nfVB6YBbhky0",
|
||
"outputId": "ebfb750f-f4f2-43a3-c62b-4e39273046de"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"mask.head() # Отобразим маску"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 359
|
||
},
|
||
"id": "FADnit0Ghky2",
|
||
"outputId": "16cf6881-4c3c-408e-f661-4141182143d5"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"#и отбираем данные\n",
|
||
"temp = data[mask] # Выбираем данные из датафрейма в соответствии с маской и записываем их в новый даатафрейм temp\n",
|
||
"temp # Отображаем temp"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "EHT3VYtNhky4",
|
||
"outputId": "84091e0c-6995-4d73-b63a-c69aeec6ecd4"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data[data.price>300].head()# Альтернативный вариант"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 616
|
||
},
|
||
"id": "Moi8GwyVhky8",
|
||
"outputId": "f4020760-204a-42f0-9df3-28242583e16e"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data[(data.price > 200) & ((data.country == 'US') | (data.country == 'France'))].head(15) # Составное условие"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"id": "Da0WfR_5hky_"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"### Мультииндексация"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "gIqtjR45hky_",
|
||
"outputId": "48a1fdf1-4c7f-4c3c-8e76-b9e773cb15c7"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.head() # Отобразим наш датафрем"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 1000
|
||
},
|
||
"id": "JizHrXguhkzC",
|
||
"outputId": "4b793963-14d9-4f87-bb3f-b26bb14d2d8e"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data_ = data.groupby(['country', 'price']).count() # Сграппируем данные сначала по странам, а затем по price\n",
|
||
"data_.head(100) # Отобразим первые 50 строк нового датафрейма"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 450
|
||
},
|
||
"id": "hE9aG1imhkzG",
|
||
"outputId": "b2abe0e9-93f7-4044-e88c-844918452e52"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data_.loc['US'] # Отобразим все данные для 'US'"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 170
|
||
},
|
||
"id": "ZteYvkfehkzL",
|
||
"outputId": "938087ed-492f-4ddf-c3f1-16ca3b38f331"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data_.loc['US', 100] # Отобразим данные для 'US', у кого 100 points"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"id": "6o6JX1OnhkzP"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"#### Как изменять значения в табличке"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "sVulAc0HsaLu",
|
||
"outputId": "6fca9bb4-357b-4398-e509-6191a1ee9e74"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data_backup = data.copy() # Создаем копию нашего датафрейма и записываем в переменную data_backup\n",
|
||
"data.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 297
|
||
},
|
||
"id": "eMhSX4jqhkzP",
|
||
"outputId": "5ee71b23-0935-46c4-925f-912e28eeda25"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.iloc[0,1] = 'kotiki' # Вставляем новое значение в 0 строку и 1 стоблец\n",
|
||
"data.iloc[2,2] = '129' # Вставляем новое значение в 2 строку и 2 стоблец\n",
|
||
"data.iloc[3:5,2:5] = 'new' # Вставляем новое значение с 3 по 5 строку и со 2го по 5ый стоблец\n",
|
||
"data.head(8)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"id": "RZvBCkMCsiOT"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data = data_backup.copy() # Восстанавливаем данные из копии"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "SNliNx3TPCTX",
|
||
"outputId": "2f0ccacf-df6b-4499-c844-12be866957dc"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 204
|
||
},
|
||
"id": "qXlU-wqyhkzT",
|
||
"outputId": "b65e6f8a-0562-43d7-b09a-ff2c8f9eab35"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"data.loc[data.country == 'US', 'region_2'] = 'Syberia'\n",
|
||
"data.loc[data.price > 100, 'points'] = 200\n",
|
||
"data.loc[data.price > 100, 'price'] = 1000\n",
|
||
"data.head()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"id": "6PumqSIU7Q7U"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"## Перевод в Numpy\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 51
|
||
},
|
||
"id": "Obs9TzQ9E8ss",
|
||
"outputId": "7ea94ae7-0ee5-4773-be88-14ab92522349"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"np_data = data.values # Получаем данные из датафрейма и записываем их в переменную np_data\n",
|
||
"print(np_data.shape) # Выводим размерность np_data\n",
|
||
"np_data.dtype"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 105
|
||
},
|
||
"id": "y1n1cNpdrFqQ",
|
||
"outputId": "f6f3c2df-9732-40ed-8569-5d53223b9a24"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"print(np_data[0]) # Выводим 0ой элемент из массива"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": null,
|
||
"metadata": {
|
||
"colab": {
|
||
"base_uri": "https://localhost:8080/",
|
||
"height": 717
|
||
},
|
||
"id": "UsAcU8mBnWwn",
|
||
"outputId": "4eec37c8-c418-4140-de98-c4ec5b6bbe8b"
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# Выведем первые 10 элементов из np_data\n",
|
||
"for i in range(10):\n",
|
||
" print(np_data[i])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"id": "8F6G2AUnKFCA"
|
||
},
|
||
"source": [
|
||
"# **Глоссарий**\n",
|
||
"\n",
|
||
"\n",
|
||
"pd.DataFrame(данные, columns = [колонки, если есть], index = [индексы ,если есть]) - создать датафрейм\n",
|
||
"\n",
|
||
"pd.read_csv(полный адрес расположения файла) - открыть .csv файл\n",
|
||
"\n",
|
||
"------------\n",
|
||
"\n",
|
||
".head() - посмотреть верхушку датафрейма (первые n строк)\n",
|
||
"\n",
|
||
".tail() - посмотреть конец датафрейма (последние n строк)\n",
|
||
"\n",
|
||
".columns - список колонок датафрейма\n",
|
||
"\n",
|
||
".values - вывести массив всех значений датафрейма\n",
|
||
"\n",
|
||
".index - список индексов датафрейма\n",
|
||
"\n",
|
||
".tolist() - перевести в список\n",
|
||
"\n",
|
||
".count() - посчитать количество определенных величин во фрейме\n",
|
||
"\n",
|
||
".describe() - посмотреть основные статистические характеристики фрейма\n",
|
||
"\n",
|
||
".shape - форма фрейма (строки, колонки)\n",
|
||
"\n",
|
||
".size - размер фрейма строки*колонки\n",
|
||
"\n",
|
||
".info() - информация о данных каждой колонки\n",
|
||
"\n",
|
||
".dtypes - тип данных каждой колонки\n",
|
||
"\n",
|
||
".isnull() - где недостает значений\n",
|
||
"\n",
|
||
".isna()- есть ли значения None\n",
|
||
"\n",
|
||
".dropna() - выкинуть строки/колонки с None\n",
|
||
"\n",
|
||
".fillna() - заполнить заданным значеним ячейки, где есть None\n",
|
||
"\n",
|
||
".loc[] - вывести значения по названиям колонок\n",
|
||
"\n",
|
||
".iloc[] - вывести значения по индексам колонок\n",
|
||
"\n",
|
||
".drop() - выкинуть определенные значения\n",
|
||
"\n",
|
||
"--------------\n",
|
||
"\n",
|
||
"pd.to_datetime(колонка, которую переводим в формат временного ряда)\n",
|
||
"\n",
|
||
".groupby() - сгруппировать по конкретному признаку\n",
|
||
"\n",
|
||
".copy() - создать копию\n",
|
||
"\n",
|
||
".sort_values() - сортировка значений\n",
|
||
"\n",
|
||
"pd.concat([df1,df2]) - конкатенация фреймов\n",
|
||
"\n",
|
||
".merge(второй_датафрейм, on = 'общая колонка, по которой склеиваем', how = 'с какой стороны') - конкатенация фреймов через общий признак\n",
|
||
"\n",
|
||
"-------------\n",
|
||
"\n",
|
||
"\n",
|
||
".corr() - вычислить корреляцию\n",
|
||
"\n",
|
||
".median() - вычислить медиану\n",
|
||
"\n",
|
||
".cumsum() - вычислить куммулятивную сумму\n",
|
||
"\n",
|
||
".cumprod() - вычислить коммулятивное произведение\n",
|
||
"\n",
|
||
".cummax() - вычислить коммулятивный максимум\n",
|
||
"\n",
|
||
"-------------\n",
|
||
"\n",
|
||
".quantile([]) - вычислить квантили\n",
|
||
"\n",
|
||
".nunique() - уникальные значения для n-колонок/строк\n",
|
||
"\n",
|
||
".unique() - уникальные значения определенной колонки/строк\n",
|
||
"\n",
|
||
"------------\n",
|
||
"\n",
|
||
".apply(функция) - применить функцию для колонки/строки\n",
|
||
"\n",
|
||
".agg(набор_функций) - применить ряд функций для колонки/строки\n"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"colab": {
|
||
"provenance": [],
|
||
"toc_visible": true
|
||
},
|
||
"kernelspec": {
|
||
"display_name": "Python 3 (ipykernel)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.13.2"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 4
|
||
}
|