In this tutorial, I will demonstrate the various ways of dropping columns and rows from Pandas DataFrames. To follow along, make sure you have Pandas installed and are familiar with creating a basic DataFrame in Python.
Drop Columns and Rows in Pandas by name
To begin with, let’s consider the following DataFrame, which includes the Name, Age, Home Ownership and Car Ownership of six individuals:
     Name  Age Has Home Has Car
0     Tom   24      Yes     Yes
1     Ann   30      Yes     Yes
2   Harry   29       No      No
3  George   64       No     Yes
4     Ali   46      Yes     Yes
5  Sharma   31      Yes     Yes
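For readers who want to run the examples, here is a minimal sketch of how this DataFrame could be built. The column names and values are taken from the table above; the variable name df matches the one used in the rest of the tutorial.

```python
import pandas as pd

# Values copied from the table above; the default integer index (0 to 5) is used.
df = pd.DataFrame({
    "Name": ["Tom", "Ann", "Harry", "George", "Ali", "Sharma"],
    "Age": [24, 30, 29, 64, 46, 31],
    "Has Home": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
    "Has Car": ["Yes", "Yes", "No", "Yes", "Yes", "Yes"],
})

print(df)
```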
Drop Columns by name:
To drop columns, we use the “drop” method of the DataFrame object. Here’s what the code looks like:
# To drop the "Age" column:
df.drop(["Age"], axis=1)
Note the following:
1. The columns to be dropped are passed as a list. In this case, we just want to drop the “Age” column.
2. The axis value is set to 1 because we are working with columns. If we wanted to delete rows (by index value), the axis would be set to 0.
3. Do note that the “drop” method doesn’t modify the original DataFrame; it returns a new DataFrame. In other words, if we accessed “df” again later in the code, we would find that all the original columns are still there. To keep the result, we assign the new DataFrame to a variable.
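The three notes above can be checked with a short sketch. The variable names without_age and same_result are only illustrative. The columns= keyword used in the second call is an alternative spelling that Pandas offers in place of axis=1.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Ann", "Harry", "George", "Ali", "Sharma"],
    "Age": [24, 30, 29, 64, 46, 31],
    "Has Home": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
    "Has Car": ["Yes", "Yes", "No", "Yes", "Yes", "Yes"],
})

# drop returns a new DataFrame, so we keep the result in a new variable.
without_age = df.drop(["Age"], axis=1)

# Alternative spelling: name the columns directly instead of passing axis=1.
same_result = df.drop(columns=["Age"])

print(list(without_age.columns))  # ['Name', 'Has Home', 'Has Car']
print("Age" in df.columns)        # True: the original df still has the column
```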
Drop Rows by name or index value:
Similar to the code explained above, to drop rows by name or index value we need to pass the values as a list. The axis needs to be set to 0.
# To drop rows 3 & 4:
df.drop([3, 4], axis=0)
The output DataFrame is as follows:
     Name  Age Has Home Has Car
0     Tom   24      Yes     Yes
1     Ann   30      Yes     Yes
2   Harry   29       No      No
5  Sharma   31      Yes     Yes
Note the following:
1. Do not confuse the row names with the column called “Name”. In this case the rows are identified by index values, not names.
2. The index values of the remaining rows don’t change after the selected rows are dropped, which is why the index above jumps from 2 to 5.
3. As mentioned earlier, the original DataFrame stays unchanged.
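Here is a small sketch of the points above. It also shows one possible follow-up that the tutorial itself does not cover: if the gap in the index is unwanted, reset_index(drop=True) renumbers the remaining rows from 0. The variable names kept and renumbered are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Ann", "Harry", "George", "Ali", "Sharma"],
    "Age": [24, 30, 29, 64, 46, 31],
    "Has Home": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
    "Has Car": ["Yes", "Yes", "No", "Yes", "Yes", "Yes"],
})

# Drop the rows labelled 3 and 4; the remaining rows keep their labels.
kept = df.drop([3, 4], axis=0)
print(list(kept.index))        # [0, 1, 2, 5]

# Optional: renumber the rows from 0 and discard the old labels.
renumbered = kept.reset_index(drop=True)
print(list(renumbered.index))  # [0, 1, 2, 3]
```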
Drop Columns and Rows in Pandas by Condition
We can also drop or select columns and rows by condition. The process is slightly different and is described below:
1. Let’s say we want to remove all rows where the “Has Car” value is “No”. It’s similar to creating a filtered view by applying a filter in an Excel workbook.
2. There are two ways to get it done: either we select the rows where the “Has Car” value is “Yes”, or we select the rows where the “Has Car” value is not equal to “No”.
3. The code would be as follows:
# Select rows where the "Has Car" value is "Yes"
df[df["Has Car"] == "Yes"]
# Select rows where the "Has Car" value is not "No"
df[df["Has Car"] != "No"]
In both cases the output DataFrame is as follows:
     Name  Age Has Home Has Car
0     Tom   24      Yes     Yes
1     Ann   30      Yes     Yes
3  George   64       No     Yes
4     Ali   46      Yes     Yes
5  Sharma   31      Yes     Yes
We have successfully removed the row where the “Has Car” value was “No”. Do note that in this case, too, the original DataFrame is unchanged, and the remaining rows keep their original index values (0, 1, 3, 4, 5), just as we saw when we used the “drop” method.
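The same rows can also be removed with the “drop” method by first collecting the index labels of the rows that match the condition. The sketch below compares the two routes; the variable names filtered, no_car_labels and dropped are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Ann", "Harry", "George", "Ali", "Sharma"],
    "Age": [24, 30, 29, 64, 46, 31],
    "Has Home": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
    "Has Car": ["Yes", "Yes", "No", "Yes", "Yes", "Yes"],
})

# Route 1: keep only the rows that satisfy the condition.
filtered = df[df["Has Car"] == "Yes"]

# Route 2: find the labels of the unwanted rows, then drop them.
no_car_labels = df[df["Has Car"] == "No"].index
dropped = df.drop(no_car_labels)

print(list(filtered.index))      # [0, 1, 3, 4, 5]
print(filtered.equals(dropped))  # True
```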
Drop Columns and Rows in Pandas by Multiple Conditions
Can we remove data from a Pandas DataFrame based on multiple conditions? Yes, we can. We need to use the following operators for AND and OR conditions:
a. AND condition: Use the “&” operator.
b. OR condition: Use the “|” operator.
Let’s consider a case where, from the above DataFrame, we want to select the rows that meet both of the following conditions:
1. “Has Car” value is “Yes”
2. Age is 25 or above
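Check the following code. Each condition is wrapped in parentheses because “&” and “|” take precedence over comparison operators such as “==” and “>=”: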
df[(df["Has Car"]=="Yes") & (df["Age"]>=25)]
The code results in the following DataFrame:
     Name  Age Has Home Has Car
1     Ann   30      Yes     Yes
3  George   64       No     Yes
4     Ali   46      Yes     Yes
5  Sharma   31      Yes     Yes
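As a further sketch, the conditions can be stored in named variables and combined in different ways. The OR example and the use of “~” (NOT) to drop the matching rows instead of keeping them are illustrations added here, not cases from the walkthrough above. All variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Tom", "Ann", "Harry", "George", "Ali", "Sharma"],
    "Age": [24, 30, 29, 64, 46, 31],
    "Has Home": ["Yes", "Yes", "No", "No", "Yes", "Yes"],
    "Has Car": ["Yes", "Yes", "No", "Yes", "Yes", "Yes"],
})

# Each condition is a boolean Series with one True/False value per row.
has_car = df["Has Car"] == "Yes"
is_25_plus = df["Age"] >= 25

# AND: keep rows where both conditions hold.
both = df[has_car & is_25_plus]
print(list(both.index))  # [1, 3, 4, 5]

# OR: keep rows where at least one condition holds.
no_home_or_under_25 = df[(df["Has Home"] == "No") | (df["Age"] < 25)]
print(list(no_home_or_under_25.index))  # [0, 2, 3]

# NOT: drop the rows that meet both conditions and keep the rest.
rest = df[~(has_car & is_25_plus)]
print(list(rest.index))  # [0, 2]
```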
I hope you enjoyed learning the various methods to filter out data from a DataFrame. These processes become especially important when we have a large volume of data to handle and a lot of cleansing to do. Moreover, you can also export Pandas DataFrames to Excel. That makes it easy to send the filtered data to anyone without worrying whether the receiver has Python and Pandas installed.