Title | Online Lab 2 Solutions - Big Data |
---|---|
Course | Big Data Fundamentals |
Institution | University of Strathclyde |
CS989 / CS982: Big Data Fundamentals / Techniques
Lab 2 – Data Handling and Manipulation

Part 1

1. Import the necessary libraries (numpy and pandas)

    import numpy as np
    import pandas as pd

2. Import the Pokemon dataset available at https://www.kaggle.com/alopez247/pokemon

I downloaded the dataset and used this command:

    pokemon = pd.read_csv("C://Work Files//Teaching//Big Data//Datasets//pokemon_alopez247.csv")

3. Print the first 10 entries

    pokemon.head(10)

4. How many observations and columns are there?

    pokemon.shape

5. Print the names of all of the columns

    pokemon.columns
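The inspection steps in Part 1 can be tried without downloading the CSV. The sketch below is only an illustration: the small `pokemon` frame is made-up toy data (the `Mon_*` names and all numbers are hypothetical), reusing a few column names that appear in this lab.

```python
import pandas as pd

# Hypothetical toy data standing in for the Kaggle CSV; all values are made up.
pokemon = pd.DataFrame(
    {
        "Name": ["Mon_A", "Mon_B", "Mon_C", "Mon_D"],
        "Type_1": ["Grass", "Fire", "Water", "Water"],
        "Attack": [49, 52, 48, 63],
        "Defense": [49, 43, 65, 80],
    }
)

first_rows = pokemon.head(2)          # first 2 rows (the lab uses head(10))
n_rows, n_cols = pokemon.shape        # shape is an attribute, so no parentheses
column_names = list(pokemon.columns)  # columns is an Index; list() gives plain names

print(first_rows)
print(n_rows, n_cols)
print(column_names)
```

Note that `shape` returns a `(rows, columns)` tuple, so the number of observations is its first element and the number of columns its second.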
Part 2

1. Import the necessary libraries (numpy and pandas)

    import numpy as np
    import pandas as pd

2. Import the Pokemon dataset available at https://www.kaggle.com/alopez247/pokemon

I downloaded the dataset and used this command:

    pokemon = pd.read_csv("C://Work Files//Teaching//Big Data//Datasets//pokemon_alopez247.csv")

3. Sort by attack power from high to low

    pokemon.Attack.sort_values(ascending=False)

4. Describe the defence power for each Pokemon type

    pokemon.groupby('Type_1').Defense.describe()

5. What are the mean, median, max and minimum of the total column for each Pokemon type?

    pokemon.groupby('Type_1').Total.agg(["mean", "min", "max", "median"])

6. What is the most common Pokemon type?

    p = pokemon.groupby("Type_1").size().reset_index(name='counts')
    p = p.sort_values(['counts'], ascending=False)
    p.head(1)

or

    pokemon["Type_1"].value_counts().idxmax()
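As a rough illustration of the Part 2 steps, the sketch below runs the same calls on a made-up toy frame (the `Mon_*` names and all numbers are hypothetical, not taken from the real dataset). It also shows one point worth knowing: `pokemon.Attack.sort_values(...)` returns only the sorted Attack column, whereas `pokemon.sort_values("Attack", ...)` reorders whole rows.

```python
import pandas as pd

# Hypothetical toy data; column names follow the lab, values are made up.
pokemon = pd.DataFrame(
    {
        "Name": ["Mon_A", "Mon_B", "Mon_C", "Mon_D", "Mon_E"],
        "Type_1": ["Water", "Fire", "Water", "Grass", "Water"],
        "Attack": [48, 52, 63, 49, 83],
        "Defense": [65, 43, 80, 49, 100],
        "Total": [314, 309, 405, 318, 530],
    }
)

# Sorting the Series gives only the Attack values (original index kept).
attack_only = pokemon.Attack.sort_values(ascending=False)

# Sorting the DataFrame keeps every column aligned with its row.
by_attack = pokemon.sort_values("Attack", ascending=False)

# Summary statistics of Defense within each primary type.
defense_summary = pokemon.groupby("Type_1").Defense.describe()

# Several aggregates of Total per type in one call.
total_stats = pokemon.groupby("Type_1").Total.agg(["mean", "median", "max", "min"])

# value_counts() counts rows per type; idxmax() returns the label with the top count.
most_common = pokemon["Type_1"].value_counts().idxmax()

print(by_attack[["Name", "Attack"]])
print(total_stats)
print(most_common)
```

The `value_counts().idxmax()` form returns just the type label, while the `groupby(...).size()` form in the solution returns a one-row frame that also shows the count.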
Part 3

1. Import the necessary libraries (numpy and pandas)

    import numpy as np
    import pandas as pd

2. Import the Pokemon dataset available at https://www.kaggle.com/alopez247/pokemon

I downloaded the dataset and used this command:

    pokemon = pd.read_csv("C://Work Files//Teaching//Big Data//Datasets//pokemon_alopez247.csv")

3. Check if there are missing values

    pokemon.isnull().any()

or

    pokemon.isnull().sum()

4. Remove the Type_2 and Egg_Group_2 columns

    pokemon = pokemon.drop("Type_2", axis=1)
    pokemon = pokemon.drop("Egg_Group_2", axis=1)
    pokemon

5. Drop any rows that have null values

    pokemon = pokemon.dropna()
    pokemon

...
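The order of the Part 3 cleaning steps matters, which a small experiment can show. The sketch below uses a made-up toy frame (hypothetical `Mon_*` names and values, with missing entries placed by hand) and drops both columns in a single `drop(columns=[...])` call, which is equivalent to the two `axis=1` calls in the solution.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with hand-placed missing values (np.nan).
pokemon = pd.DataFrame(
    {
        "Name": ["Mon_A", "Mon_B", "Mon_C", "Mon_D"],
        "Type_1": ["Water", "Fire", "Grass", "Water"],
        "Type_2": [np.nan, "Flying", "Poison", np.nan],
        "Egg_Group_2": ["Dragon", np.nan, np.nan, np.nan],
        "Attack": [48, 52, np.nan, 63],
    }
)

has_missing = pokemon.isnull().any()         # True/False per column
missing_per_column = pokemon.isnull().sum()  # number of nulls per column

# Dropping rows first would discard every row here, because each row
# is missing a value in at least one of the sparse columns.
rows_if_dropna_first = len(pokemon.dropna())

# Dropping the sparse columns first, then the remaining incomplete rows,
# keeps far more data.
trimmed = pokemon.drop(columns=["Type_2", "Egg_Group_2"])
complete = trimmed.dropna()

print(missing_per_column)
print(rows_if_dropna_first, len(complete))
```

In this toy frame, calling `dropna()` before removing the two sparse columns leaves no rows at all, while removing the columns first keeps three of the four rows.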