Pandas is an open-source data manipulation library created by Wes McKinney and built on NumPy. Its key data structure is the DataFrame, which lets you store and manipulate tabular data in rows of observations and columns of variables.
Pandas offers many features that help data scientists with data manipulation and analysis. Some of the key features are:
- Handles missing data and data slicing efficiently
- Uses Series for one-dimensional data and DataFrames for two-dimensional (tabular) data
- Offers flexibility to merge, concatenate or manipulate the data
- Provides strong support for time series data
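The features listed above can be sketched in a few lines. This is only a minimal illustration, assuming pandas is installed; the values and the names `readings`, `left`, `right`, and `ts` are made up for the example and do not come from any real dataset.

```python
import pandas as pd

# A Series is one-dimensional; None is stored as NaN (missing) in a float Series.
readings = pd.Series([21.5, None, 23.0, 22.1], name="reading")

# Missing-data handling: count the gaps, then fill them with the mean.
n_missing = int(readings.isna().sum())
filled = readings.fillna(readings.mean())

# Slicing by position: the first two elements.
first_two = readings.iloc[:2]

# Merging two DataFrames on a shared key column (hypothetical "id" and "score").
left = pd.DataFrame({"id": [1, 2, 3], "label": ["a", "b", "c"]})
right = pd.DataFrame({"id": [2, 3, 4], "score": [10, 20, 30]})
merged = left.merge(right, on="id", how="inner")

# Time series: a daily DatetimeIndex resampled into 2-day sums.
ts = pd.Series(range(4), index=pd.date_range("2024-01-01", periods=4, freq="D"))
two_day = ts.resample("2D").sum()

print(n_missing)
print(merged)
print(two_day)
```

An inner merge keeps only the keys present in both frames; other values of `how` (such as `"left"` or `"outer"`) keep unmatched rows and fill the gaps with missing values.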
A DataFrame is a two-dimensional data structure, meaning the data is aligned in rows and columns. DataFrames are the standard way to store data in Pandas. They are size-mutable and potentially heterogeneous: rows and columns can be added or removed, and each column can hold a different data type.
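To make "size-mutable" and "heterogeneous" concrete, here is a small sketch, assuming pandas is installed. The column names and values are invented for illustration.

```python
import pandas as pd

df = pd.DataFrame({"name": ["a", "b"], "count": [1, 2]})

# Heterogeneous: each column carries its own dtype.
dtypes_before = df.dtypes
print(dtypes_before)

# Size-mutable: add a new column in place...
df["ratio"] = [0.5, 0.25]

# ...and append a new row by assigning to a label that does not exist yet.
df.loc[len(df)] = ["c", 3, 0.75]

print(df.shape)
```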
There are multiple ways to create DataFrames. One is to pass a dictionary of lists, where each key becomes a column:
import pandas as pd
dict1 = {"country": ["USA", "Mexico", "India", "Australia", "China", "Indonesia"],
         "language": ["English", "Spanish", "Hindi", "English", "Chinese", "Indonesian"]}
df = pd.DataFrame(dict1)  # each dictionary key becomes a column
print(df)
This prints:
     country    language
0        USA     English
1     Mexico     Spanish
2      India       Hindi
3  Australia     English
4      China     Chinese
5  Indonesia  Indonesian
You can also create a DataFrame from a list:
import pandas as pd
list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
df = pd.DataFrame(list1)  # produces a single column labeled 0
print(df)
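When a DataFrame is built from a plain list, the column gets the default label 0. The sketch below, assuming pandas is installed, shows how the `columns` argument can supply a name, and how a list of lists is read as rows. The names `value`, `id`, and `label` are hypothetical.

```python
import pandas as pd

list1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Without a name, the single column is labeled 0.
df_default = pd.DataFrame(list1)

# The columns argument gives the column an explicit name.
df_named = pd.DataFrame(list1, columns=["value"])

# A list of lists is interpreted row by row.
rows = [[1, "a"], [2, "b"]]
df_rows = pd.DataFrame(rows, columns=["id", "label"])

print(df_named.head())
print(df_rows)
```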
You can also import CSV files to create DataFrames. Assuming a file named example.csv is stored in your working directory, you can load it with pd.read_csv().
import pandas as pd
data = pd.read_csv('example.csv')  # read example.csv into a DataFrame
print(data)
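Since example.csv may not exist on your machine, the following sketch reads the same kind of data from an in-memory string instead; `pd.read_csv()` accepts file-like objects as well as paths. The CSV text is a made-up stand-in for the file, and `usecols` and `nrows` are shown as two commonly used options.

```python
import io

import pandas as pd

# Stand-in for the contents of a CSV file (invented for this example).
csv_text = "country,language\nUSA,English\nMexico,Spanish\n"

# Read the whole "file"; the first line is used as the header.
data = pd.read_csv(io.StringIO(csv_text))

# Read only one column and only the first data row.
subset = pd.read_csv(io.StringIO(csv_text), usecols=["country"], nrows=1)

print(data)
print(subset)
```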
You can also import Excel files to create DataFrames. Assuming a file named example.xlsx is stored in your working directory, you can load it with pd.read_excel(). Reading .xlsx files requires an Excel engine such as openpyxl to be installed.
import pandas as pd
data = pd.read_excel('example.xlsx')  # read example.xlsx into a DataFrame
print(data)
You can drop a column using the drop() method:
import pandas as pd
dict1 = {"country": ["USA", "Mexico", "India", "Australia", "China", "Indonesia"],
         "language": ["English", "Spanish", "Hindi", "English", "Chinese", "Indonesian"]}
df = pd.DataFrame(dict1)
df = df.drop("country", axis=1)  # drop() returns a new DataFrame, so reassign the result
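A few variations on `drop()` may be useful here. This is a small sketch, assuming pandas is installed; the extra `code` column is added only so there is more than one column to remove.

```python
import pandas as pd

df = pd.DataFrame({
    "country": ["USA", "Mexico"],
    "language": ["English", "Spanish"],
    "code": ["US", "MX"],
})

# columns= is an alternative to passing axis=1.
without_country = df.drop(columns="country")

# Several columns can be dropped at once by passing a list.
without_two = df.drop(columns=["country", "code"])

# index= drops rows by label instead of columns.
without_first_row = df.drop(index=0)

# The original DataFrame is unchanged because drop() returns a copy.
print(list(df.columns))
print(list(without_country.columns))
```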
You can export a DataFrame to a CSV file using the to_csv() method:
df.to_csv("output.csv")
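By default, `to_csv()` also writes the row index as the first column. The sketch below, assuming pandas is installed, compares the default with `index=False` and reads the result back. It omits the file path so that `to_csv()` returns the CSV text as a string rather than writing output.csv.

```python
import io

import pandas as pd

df = pd.DataFrame({"country": ["USA", "Mexico"], "language": ["English", "Spanish"]})

# With no path, to_csv() returns the CSV text instead of writing a file.
csv_with_index = df.to_csv()

# index=False leaves out the row labels.
csv_without_index = df.to_csv(index=False)

# Reading the text back gives an equivalent DataFrame.
round_trip = pd.read_csv(io.StringIO(csv_without_index))

print(csv_with_index)
print(csv_without_index)
```

Leaving the index out is often the better choice when the index is just the default 0, 1, 2, ... numbering, because otherwise reading the file back produces an extra unnamed column.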