Online Lab 2 Solutions - Big Data PDF

Title Online Lab 2 Solutions - Big Data
Course Big Data Fundamentals
Institution University of Strathclyde
Pages 2


Description

CS989 / CS982: Big Data Fundamentals / Techniques
Lab 2 – Data Handling and Manipulation

Part 1

1. Import the necessary libraries (numpy and pandas)

    import numpy as np
    import pandas as pd

2. Import the Pokemon dataset available at https://www.kaggle.com/alopez247/pokemon

I downloaded the dataset and used this command:

    pokemon = pd.read_csv("C://Work Files//Teaching//Big Data//Datasets//pokemon_alopez247.csv")

3. Print the first 10 entries

    pokemon.head(10)

4. How many observations and columns are there?

    pokemon.shape

5. Print the names of all of the columns

    pokemon.columns
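The Part 1 inspection steps can be tried without downloading the Kaggle file. The sketch below is only an illustration: the four rows are made up, and the names first_rows, n_rows, n_cols and column_names are hypothetical. In the lab itself the DataFrame comes from pd.read_csv.

```python
import pandas as pd

# Made-up stand-in rows (an assumption for illustration);
# the lab loads the real data with pd.read_csv instead.
pokemon = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type_1": ["Grass", "Fire", "Water", "Water"],
    "Attack": [50, 80, 65, 40],
    "Defense": [60, 45, 70, 55],
})

# head(n) returns the first n rows; the lab uses head(10).
first_rows = pokemon.head(2)

# shape is an attribute (no parentheses): (number of rows, number of columns).
n_rows, n_cols = pokemon.shape

# columns is an Index object; list() turns it into plain Python strings.
column_names = list(pokemon.columns)

print(first_rows)
print(f"{n_rows} observations, {n_cols} columns")
print(column_names)
```

Note that shape reports rows first, so the number of observations is the first element of the tuple.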

Part 2

1. Import the necessary libraries (numpy and pandas)

    import numpy as np
    import pandas as pd

2. Import the Pokemon dataset available at https://www.kaggle.com/alopez247/pokemon

I downloaded the dataset and used this command:

    pokemon = pd.read_csv("C://Work Files//Teaching//Big Data//Datasets//pokemon_alopez247.csv")

3. Sort by attack power from high to low

    pokemon.Attack.sort_values(ascending=False)

4. Describe the defence power for each Pokemon type

    pokemon.groupby('Type_1').Defense.describe()

5. What are the mean, median, max and minimum of the total column for each Pokemon type?

    pokemon.groupby('Type_1').Total.agg(["mean", "min", "max", "median"])

6. What is the most common Pokemon type?

    p = pokemon.groupby("Type_1").size().reset_index(name='counts')
    p = p.sort_values(['counts'], ascending=False)
    p.head(1)

or

    pokemon["Type_1"].value_counts().idxmax()
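As a rough check of the Part 2 steps, the sketch below runs the same kind of sorting and grouping on a made-up five-row table (the rows and the names by_attack, total_stats and most_common are hypothetical). It also shows one possible variation: pokemon.Attack.sort_values returns only the Attack column, whereas DataFrame.sort_values keeps the whole rows in sorted order.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the Kaggle dataset.
pokemon = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Type_1": ["Water", "Fire", "Water", "Grass", "Water"],
    "Attack": [50, 80, 65, 40, 70],
    "Defense": [60, 45, 70, 55, 50],
    "Total": [300, 320, 400, 280, 350],
})

# Sort complete rows by Attack, highest first.
by_attack = pokemon.sort_values("Attack", ascending=False)

# One row per type, one column per requested statistic of Total.
total_stats = pokemon.groupby("Type_1")["Total"].agg(["mean", "median", "max", "min"])

# value_counts() counts rows per type; idxmax() returns the label with the largest count.
most_common = pokemon["Type_1"].value_counts().idxmax()

print(by_attack)
print(total_stats)
print(most_common)
```

If two types are tied for the largest count, idxmax returns only one of them, so value_counts() itself is worth printing when ties matter.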

Part 3

1. Import the necessary libraries (numpy and pandas)

    import numpy as np
    import pandas as pd

2. Import the Pokemon dataset available at https://www.kaggle.com/alopez247/pokemon

I downloaded the dataset and used this command:

    pokemon = pd.read_csv("C://Work Files//Teaching//Big Data//Datasets//pokemon_alopez247.csv")

3. Check if there are missing values

    pokemon.isnull().any()

or

    pokemon.isnull().sum()

4. Remove the Type_2 and Egg_Group_2 columns

    pokemon = pokemon.drop("Type_2", axis=1)
    pokemon = pokemon.drop("Egg_Group_2", axis=1)
    pokemon

5. Drop any rows that have null values

    pokemon = pokemon.dropna()
    pokemon

...
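The missing-value steps in Part 3 can be sketched on a small made-up table. Everything here is an assumption for illustration: the rows, the extra Pr_Male column used to leave one null after the two columns are dropped, and the names missing_per_column, trimmed and complete. The two drop calls are combined into one using the columns= keyword, which is an alternative to axis=1.

```python
import numpy as np
import pandas as pd

# Made-up rows with deliberate gaps (np.nan marks a missing value).
pokemon = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type_1": ["Water", "Fire", "Grass", "Water"],
    "Type_2": [np.nan, "Flying", np.nan, "Ice"],
    "Egg_Group_2": [np.nan, np.nan, "Monster", np.nan],
    "Pr_Male": [0.5, np.nan, 0.875, 0.5],
})

# Count of missing values in each column.
missing_per_column = pokemon.isnull().sum()

# Drop both columns in a single call; drop returns a new DataFrame.
trimmed = pokemon.drop(columns=["Type_2", "Egg_Group_2"])

# Remove every row that still contains at least one null.
complete = trimmed.dropna()

print(missing_per_column)
print(complete)
```

Dropping the sparse columns before calling dropna matters: on this toy table, calling dropna first would discard every row, because each one has a null in Type_2 or Egg_Group_2.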

