Pandas is an open-source library for data manipulation, created by Wes McKinney and built on NumPy. DataFrames are the key data structure in Pandas; they let you store and manipulate tabular data in rows of observations and columns of variables.

Why should you use Pandas?

Pandas is a powerful library with many features that help data scientists manipulate and analyze data. Some of the key features are listed below:

Handles missing data and data slicing efficiently
Uses Series for one-dimensional data and DataFrames for two-dimensional (tabular) data
Offers flexibility to merge, concatenate or manipulate the data
Provides strong support for time series data
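
The features listed above can be illustrated with a short sketch. All values and variable names here are made up for illustration and are not taken from a real dataset.

```python
import numpy as np
import pandas as pd

# A Series is one-dimensional; NaN marks a missing value.
scores = pd.Series([90.0, np.nan, 75.0], name="score")

# Missing data: count the gaps, then fill them with a placeholder.
missing_count = int(scores.isna().sum())
filled = scores.fillna(0.0)

# Slicing: select the first two entries by position.
first_two = filled.iloc[:2]

# Merging: join two small tables on a shared key column.
left = pd.DataFrame({"id": [1, 2, 3], "name": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "points": [10, 20, 30]})
merged = pd.merge(left, right, on="id", how="inner")

# Concatenating: stack two tables on top of each other.
stacked = pd.concat([left, left], ignore_index=True)

# Time series: daily values summed into two-day buckets.
daily = pd.Series(
    [1, 2, 3, 4],
    index=pd.date_range("2024-01-01", periods=4, freq="D"),
)
two_day_totals = daily.resample("2D").sum()
```

An inner merge keeps only the keys present in both tables, so `merged` contains the rows for `id` 2 and 3.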

What is a DataFrame?

A DataFrame is a two-dimensional data structure, meaning the data is aligned in rows and columns. DataFrames are the standard way to store data in Pandas. They are size-mutable, so rows and columns can be added or removed, and potentially heterogeneous, so different columns can hold different data types.
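
As a minimal sketch of what "size-mutable" and "heterogeneous" mean in practice, the example below builds a small DataFrame with illustrative (made-up) values, then adds a column and a row. The column names are assumptions chosen for the example.

```python
import pandas as pd

# Three columns with three different types: text, integers, floats.
df = pd.DataFrame({
    "item": ["pen", "book"],
    "quantity": [3, 1],
    "price": [1.5, 12.0],
})

# Heterogeneous: each column keeps its own dtype.
distinct_dtypes = len(set(map(str, df.dtypes)))
shape_before = df.shape

# Size-mutable: add a new column, then append a new row.
df["in_stock"] = [True, False]
df.loc[len(df)] = ["lamp", 2, 20.0, True]

shape_after = df.shape
```

The shape grows from two rows and three columns to three rows and four columns, without creating a new object.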

How to create a DataFrame

There are multiple ways to create DataFrames.

Using Dictionaries

import pandas as pd
dict1 = {"country": ["USA", "Mexico", "India", "Australia","China", "Indonesia"],
       "language": ["English", "spanish", "Hindi", "English", "Chinese", "Indonesian"]}

df = pd.DataFrame(dict1)  # pass the dictionary defined above, not the built-in dict type
print(df)

This gives the following result:

     country    language
0        USA     English
1     Mexico     spanish
2      India       Hindi
3  Australia     English
4      China     Chinese
5  Indonesia  Indonesian
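
A dictionary of lists is one layout; a list of dictionaries (one per row) is another that `pd.DataFrame` accepts. The sketch below uses a shortened, illustrative version of the same data and shows a few attributes for checking the result.

```python
import pandas as pd

# One dictionary per row; the keys become column names.
records = [
    {"country": "USA", "language": "English"},
    {"country": "Mexico", "language": "Spanish"},
]
df = pd.DataFrame(records)

shape = df.shape               # (number of rows, number of columns)
columns = list(df.columns)     # column labels in order
first_row = df.iloc[0].to_dict()
```

Checking `shape` and `columns` right after construction is a quick way to confirm the data was laid out as intended.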

Using Lists

import pandas as pd
list1 = [1,2,3,4,5,6,7,8,9,10]
df = pd.DataFrame(list1)
print(df)
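
When a DataFrame is built from a flat list, the single column is labelled `0` by default. The following sketch shows how the `columns` argument can name it, and how a list of lists produces one row per inner list. The column names are assumptions for illustration.

```python
import pandas as pd

# A flat list becomes a single column; `columns` gives it a name.
numbers = pd.DataFrame([1, 2, 3], columns=["value"])

# A list of lists becomes one row per inner list.
rows = [["USA", "English"], ["Mexico", "Spanish"]]
pairs = pd.DataFrame(rows, columns=["country", "language"])

# Without `columns`, pandas falls back to integer labels.
default_labels = list(pd.DataFrame([1, 2, 3]).columns)
```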

Import from CSV files

You can also import CSV files to create DataFrames. Suppose you have a file named example.csv; it can be loaded with pd.read_csv().

import pandas as pd

data = pd.read_csv('example.csv')  # read example.csv into a DataFrame

print(data)
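
To try `pd.read_csv()` without a file on disk, the sketch below reads from an in-memory string instead. The CSV text is an illustrative stand-in for example.csv, whose real contents are not shown here; `usecols` and `nrows` are optional parameters that limit what is loaded.

```python
import io

import pandas as pd

# Illustrative CSV content standing in for a real file.
csv_text = "country,language\nUSA,English\nIndia,Hindi\n"

# read_csv accepts a file path or any file-like object.
data = pd.read_csv(io.StringIO(csv_text))

# Load only one column and only the first data row.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["country"], nrows=1)
```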

Import from Excel files

You can also import Excel files to create DataFrames. Suppose you have a file named example.xlsx; it can be loaded with pd.read_excel(). Reading .xlsx files requires an Excel engine such as openpyxl to be installed.

import pandas as pd

data = pd.read_excel('example.xlsx')  # read example.xlsx into a DataFrame

print(data)

Dropping a column

You can drop a column using the drop() method.

import pandas as pd
dict1 = {"country": ["USA", "Mexico", "India", "Australia","China", "Indonesia"],
       "language": ["English", "spanish", "Hindi", "English", "Chinese", "Indonesian"]}

df = pd.DataFrame(dict1)
# drop() returns a new DataFrame, so assign the result to keep it
df = df.drop("country", axis=1)
print(df)
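
Because `drop()` returns a new DataFrame by default, the original stays unchanged unless the result is assigned or `inplace=True` is passed. The sketch below shows both behaviours using the `columns=` keyword, an alternative spelling of `axis=1`. The extra `code` column is added only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["USA", "India"],
    "language": ["English", "Hindi"],
    "code": ["US", "IN"],  # illustrative extra column
})

# Default behaviour: a new DataFrame is returned, df is untouched.
without_country = df.drop(columns=["country"])
original_columns = list(df.columns)

# inplace=True modifies df directly and returns None.
df.drop(columns=["language"], inplace=True)
remaining_columns = list(df.columns)
```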

Exporting a DataFrame to CSV

You can export a DataFrame to a CSV file using the to_csv() method.

df.to_csv("output.csv")
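
By default, `to_csv()` also writes the row index as an unnamed first column. The sketch below compares that with `index=False`. To stay self-contained it calls `to_csv()` without a path, which returns the CSV text as a string instead of writing a file. The data is illustrative.

```python
import io

import pandas as pd

df = pd.DataFrame({
    "country": ["USA", "India"],
    "language": ["English", "Hindi"],
})

# With no path, to_csv() returns the CSV text.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# Reading the index-free text back gives the same columns and values.
round_trip = pd.read_csv(io.StringIO(without_index))
```

Passing `index=False` is a common choice when the index is just the default 0, 1, 2, ... numbering and carries no information.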
