Pandas DataFrames are two-dimensional data tables that store and represent data in rows and columns. As in any other two-dimensional table, each row of a Pandas DataFrame holds one data instance (a record), and each column holds one field of those instances. Let’s understand more with an example.

A firm wants to store the following details for all its clients: Client Name, Annual Revenue, and Profit Percentage. The table below shows what this would look like:
Serial No. | Client Name | Revenue | Profit % |
1 | Client_A | $300,000 | 25% |
2 | Client_B | $250,000 | 32% |
3 | Client_C | $180,000 | 20% |
4 | Client_D | $320,000 | 33% |
You may have noticed how the data table is structured:
- The field names are stored as column names (e.g. Client Name, Revenue, etc.).
- The rows are identified by serial numbers (e.g. 1, 2, 3, etc.).
- The data is populated based on the serial numbers (row names) and field names (column names).
Pandas DataFrames allow you to store data in a similar way. Most importantly, you can store a whole DataFrame in a single variable. Technically speaking, a Pandas DataFrame is an object, and it comes with many methods for manipulating its data.
In other words, a single line of code is often enough to slice and reshape the data for better views and analysis. For example, you can drop columns or rows from a Pandas DataFrame. Before getting into more detail about data handling with DataFrames, let’s understand how to create a DataFrame in Python. You can check some of the other articles on this blog to learn more about manipulating Pandas DataFrames using the available methods.
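As a rough illustration of the row and column dropping mentioned above, the sketch below uses the drop method on a small table. The variable names (clients, without_profit, without_first) are made up for this example, and the DataFrame is built with the constructor that the next section explains.

```python
from pandas import DataFrame

# Small illustrative table (values taken from the client example above)
clients = DataFrame(
    [["Client_A", "$300,000", "25%"],
     ["Client_B", "$250,000", "32%"]],
    columns=["Client Name", "Revenue", "Profit %"],
)

# drop() returns a new DataFrame; the original is left unchanged
without_profit = clients.drop(columns=["Profit %"])  # remove a column
without_first = clients.drop(index=[0])              # remove the row labelled 0

print(without_profit.columns.tolist())
print(without_first.index.tolist())
```

Because drop returns a new object by default, clients still has all three columns and both rows after these calls.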
Creating a DataFrame using the Pandas Library
To use DataFrames in Python, you first need to install the Pandas library. Once the installation is complete, you need to import the DataFrame class from Pandas. Here’s the code for the import:
from pandas import DataFrame
The DataFrame object accepts the following attributes (strictly speaking, constructor parameters): data, index, columns, dtype, and copy. To get started, you really don’t need to understand all of them. The most frequently used ones are “data”, “columns”, and “index”. The “data” attribute takes the data itself, the “columns” attribute takes the column names, and the “index” attribute takes the “row names”. Note that you can create a DataFrame without explicitly passing values for “columns” and “index”. If you don’t pass them, the columns and rows are labelled with default integer positions (i.e. 0, 1, 2, 3, 4, …).
In fact, you can choose not to pass any attribute at all (including data). If you create a DataFrame without any attributes, you essentially create an empty DataFrame.
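The defaults described above can be checked with a short sketch. The names empty_df and unlabelled are only illustrative, and the numbers are placeholder data.

```python
from pandas import DataFrame

# No arguments at all: an empty DataFrame
empty_df = DataFrame()
print(empty_df.empty)            # True, since there is no data

# Only data: rows and columns get default integer labels
unlabelled = DataFrame([[1, 2, 3], [4, 5, 6]])
print(list(unlabelled.index))    # [0, 1]
print(list(unlabelled.columns))  # [0, 1, 2]
```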
Understanding data, columns & index attributes in Pandas DataFrames:
Each row of data in a table can be written as a Python list, so the complete data for a DataFrame can be represented as a “list of lists”. We can pass a list of lists as the value of the data attribute when creating a DataFrame. Let’s create a DataFrame from the data in the above example by first storing it as a list of lists:
from pandas import DataFrame

# Storing each client's info in a separate list
L1 = ['Client A', '$300,000', '25%']
L2 = ['Client B', '$250,000', '32%']
L3 = ['Client C', '$180,000', '20%']
L4 = ['Client D', '$320,000', '33%']

# Creating a list of lists
all_data = [L1, L2, L3, L4]

# Passing the list of lists as the data attribute to a DataFrame
df = DataFrame(all_data)
df
The output looks like:
          0         1    2
0  Client A  $300,000  25%
1  Client B  $250,000  32%
2  Client C  $180,000  20%
3  Client D  $320,000  33%
So we have the data in the format that we need but the rows and columns are yet to be named.
To name the columns we need to use the “columns” attribute. The columns attribute accepts the column names as a list. We can update the above code to add the column names as shown below:
from pandas import DataFrame

L1 = ['Client A', '$300,000', '25%']
L2 = ['Client B', '$250,000', '32%']
L3 = ['Client C', '$180,000', '20%']
L4 = ['Client D', '$320,000', '33%']
all_data = [L1, L2, L3, L4]

# Passing the column names as a list through the columns attribute
df = DataFrame(all_data, columns=['Client Name', 'Revenue', 'Profit %'])
df
The DataFrame that we now get will come with the column names:
  Client Name   Revenue Profit %
0    Client A  $300,000      25%
1    Client B  $250,000      32%
2    Client C  $180,000      20%
3    Client D  $320,000      33%
Similarly we can create a list of row names and pass it to the index attribute to add row names to the DataFrame.
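A minimal sketch of passing row names through the index attribute, assuming we reuse the serial numbers from the original table as labels (row_names is an illustrative variable name):

```python
from pandas import DataFrame

all_data = [
    ["Client A", "$300,000", "25%"],
    ["Client B", "$250,000", "32%"],
    ["Client C", "$180,000", "20%"],
    ["Client D", "$320,000", "33%"],
]

# Serial numbers from the original table, used as row labels
row_names = [1, 2, 3, 4]

df = DataFrame(
    all_data,
    columns=["Client Name", "Revenue", "Profit %"],
    index=row_names,
)

# Rows can now be looked up by their label with .loc
print(df.loc[3, "Client Name"])  # Client C
```

With the index set this way, the row labels start at 1 and match the serial numbers in the table, instead of the default labels that start at 0.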
There are two other interesting ways to create Pandas DataFrames:
- You can create a Pandas DataFrame from a Python dictionary
- You can create a Pandas DataFrame from an external data source (e.g. an Excel file)
Converting a Python Dictionary to a Pandas DataFrame
A Python dictionary stores data in key:value format. To create a Pandas DataFrame from a Python dictionary, we need to ensure that the dictionary has the column names as keys and the lists of column values as values.
The following dictionary can be converted to the DataFrame shown in the above example:
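# Keys are listed in the same order as the columns of the example table,
# because the DataFrame columns follow the dictionary's key order
d = {
    'Client Name': ['Client_A', 'Client_B', 'Client_C', 'Client_D'],
    'Revenue': ['$300,000', '$250,000', '$180,000', '$320,000'],
    'Profit %': ['25%', '32%', '20%', '33%'],
}
To convert a dictionary to a DataFrame, we will use the from_dict method: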
df1 = DataFrame.from_dict(d)
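As a small sketch of how from_dict behaves, the example below compares it with passing the dictionary straight to the DataFrame constructor, and also shows the orient="index" option, which treats each key as a row label rather than a column name. The variable names and the shortened data are illustrative only.

```python
from pandas import DataFrame

d = {
    "Client Name": ["Client_A", "Client_B"],
    "Revenue": ["$300,000", "$250,000"],
}

via_from_dict = DataFrame.from_dict(d)
via_constructor = DataFrame(d)

# Both calls build the same table
print(via_from_dict.equals(via_constructor))  # True

# orient="index": each key becomes a row label instead of a column name
by_row = DataFrame.from_dict(
    {"Client_A": ["$300,000", "25%"], "Client_B": ["$250,000", "32%"]},
    orient="index",
    columns=["Revenue", "Profit %"],
)
print(by_row.loc["Client_B", "Profit %"])  # 32%
```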
Creating Pandas DataFrames from external Excel or CSV files
You can easily convert an Excel or CSV file to a Pandas DataFrame. To learn more, do read my article on using Pandas to read Excel or CSV files. It’s also important to note that Pandas allows you to easily export DataFrames to Excel or CSV. Thus, Pandas can be used seamlessly to analyze Excel or CSV data.
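A minimal sketch of a CSV round trip with read_csv and to_csv is shown below. To keep it self-contained it reads from an in-memory string (csv_text, an assumption made for this example) rather than a real file; in practice you would pass a file path instead. Excel files work along the same lines with read_excel and to_excel, which typically need an extra package such as openpyxl installed.

```python
import io

import pandas as pd

# Illustrative CSV content; in practice this would live in a file on disk
csv_text = (
    "Client Name,Revenue,Profit %\n"
    "Client_A,300000,25\n"
    "Client_B,250000,32\n"
)

# read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

# to_csv with no path returns the CSV as a string;
# index=False leaves the row labels out of the export
exported = df.to_csv(index=False)

# Reading the exported text back gives an equal DataFrame
round_trip = pd.read_csv(io.StringIO(exported))
print(round_trip.equals(df))
```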
FAQs: Pandas DataFrames
You can create an empty DataFrame in Pandas using the following code:
from pandas import DataFrame
df = DataFrame()
You can create DataFrames in Pandas from external data sources, from Python dictionaries, and from Python lists of lists.

