We look at three ways to rename the columns of a pandas dataframe. The first is the regular brute-force approach. The other two are Pythonic ways of making the code more widely applicable and more efficient to write.
When the new column names bear no relationship to the old ones, we have little choice but to type out the entire mapping dictionary.
df = df.rename(columns={'SIS Login ID': 'email', 'SIS User ID': 'id',
'Current Score': 'grade', 'Section': 'course'})
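
To make the explicit mapping concrete, here is a minimal, self-contained sketch. The rows are invented for illustration; only the column names come from the snippet above. Columns that do not appear in the mapping would simply keep their names.

```python
import pandas as pd

# Toy data: the values are made up, only the column names follow the example above.
df = pd.DataFrame({
    'SIS Login ID': ['a@example.edu', 'b@example.edu'],
    'SIS User ID': [101, 102],
    'Current Score': [88.5, 92.0],
    'Section': ['MATH205', 'MATH205'],
})

mapping = {'SIS Login ID': 'email', 'SIS User ID': 'id',
           'Current Score': 'grade', 'Section': 'course'}

# rename() returns a new dataframe, so the result is assigned back to df.
df = df.rename(columns=mapping)
print(list(df.columns))
```
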
When the old and the new column names are related, the process can be sped up with a dict comprehension. Let’s say we want to append the string ' old' to each column name. We first create a new list of column names from the old ones.
cols = df1.columns
cols_new = [x + ' old' for x in cols]
['MATH205 - Chapter 1 and 2 Test (380919) old',
'MATH205 - Chapter 3 and 4 Test (380916) old',
'MATH205 - Chapter 5 Test (380926) old',
'MATH205 - Chapter 6 Test (380900) old',
'MATH205 - Chapter 7 Test (380902) old',
'MATH205 - Chapter 8 Test (380938) old',
'MATH205 - Chapter 9 Test (380922) old',
'MATH205 - Chapter 10 Test (380929) old']
We then create the mapping.
mapping = {key1: key2 for key1, key2 in zip(cols, cols_new)}
{'MATH205 - Chapter 1 and 2 Test (380919)': 'MATH205 - Chapter 1 and 2 Test (380919) old',
'MATH205 - Chapter 3 and 4 Test (380916)': 'MATH205 - Chapter 3 and 4 Test (380916) old',
'MATH205 - Chapter 5 Test (380926)': 'MATH205 - Chapter 5 Test (380926) old',
'MATH205 - Chapter 6 Test (380900)': 'MATH205 - Chapter 6 Test (380900) old',
'MATH205 - Chapter 7 Test (380902)': 'MATH205 - Chapter 7 Test (380902) old',
'MATH205 - Chapter 8 Test (380938)': 'MATH205 - Chapter 8 Test (380938) old',
'MATH205 - Chapter 9 Test (380922)': 'MATH205 - Chapter 9 Test (380922) old',
'MATH205 - Chapter 10 Test (380929)': 'MATH205 - Chapter 10 Test (380929) old'}
Here, we have used a dict comprehension to generate the key-value pairs. The key is the old column name and the value is the new column name. We can now rename the columns using this mapping. Note that rename() returns a new dataframe, so assign the result if you want to keep it.
df1.rename(columns=mapping)
| | MATH205 - Chapter 1 and 2 Test (380919) old | MATH205 - Chapter 3 and 4 Test (380916) old | MATH205 - Chapter 5 Test (380926) old | MATH205 - Chapter 6 Test (380900) old | MATH205 - Chapter 7 Test (380902) old | MATH205 - Chapter 8 Test (380938) old | MATH205 - Chapter 9 Test (380922) old | MATH205 - Chapter 10 Test (380929) old |
|---|---|---|---|---|---|---|---|---|
| 0 | 53.13 | 53.13 | 53.13 | 53.13 | 53.13 | 53.13 | 53.13 | 53.09 |
| 1 | 47.34 | 48.72 | 38.73 | 34.43 | 30.39 | 46.17 | 47.50 | 32.76 |
| 2 | 50.63 | 0.00 | 43.09 | 25.18 | 7.49 | 19.87 | 4.25 | 18.21 |
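
For a simple suffix like this, the intermediate list is optional. As a sketch, using a made-up two-column frame rather than the data above, the mapping can be built in a single comprehension, or the transformation can be passed to rename() as a function. pandas also provides add_suffix() for this specific case.

```python
import pandas as pd

# Hypothetical stand-in for df1 with shortened column names.
df1 = pd.DataFrame({'Test A': [53.13, 47.34], 'Test B': [53.13, 48.72]})

# Build the old -> new mapping directly, without a separate list.
mapping = {col: col + ' old' for col in df1.columns}
renamed = df1.rename(columns=mapping)

# Equivalent: pass a function that is applied to every column name.
renamed_fn = df1.rename(columns=lambda col: col + ' old')

# Equivalent: the built-in helper that appends a suffix to column labels.
renamed_suffix = df1.add_suffix(' old')

print(list(renamed.columns))
```

The dict form is the most flexible of the three, since individual entries can be adjusted before the rename is applied.
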
When a relationship between the old and the new column names exists but is too complicated for simple string operations, we can use a regular expression (regex). Here we will use a lambda function together with the re module to perform the string manipulation. For example, we want to rename
MATH205 - Chapter 1 and 2 Test (380919)
to Chapter 1 and 2 Test
We have to strip out the first part and the last part. This can be achieved using a regex. Let’s first strip the first part.
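import re
# cols_test is assumed to hold the original column names, e.g. list(df1.columns)
cols_test_new = list(map(lambda x: re.sub('MATH205 - ', '', x), cols_test))
['Chapter 1 and 2 Test (380919)',
'Chapter 3 and 4 Test (380916)',
'Chapter 5 Test (380926)',
'Chapter 6 Test (380900)',
'Chapter 7 Test (380902)',
'Chapter 8 Test (380938)',
'Chapter 9 Test (380922)',
'Chapter 10 Test (380929)']
Now, let’s strip the last part. We keep working on cols_test_new so that cols_test still holds the original names, and we use a raw string so that the backslashes reach the regex engine unchanged.
cols_test_new = list(map(lambda x: re.sub(r' \(\d+\)', '', x), cols_test_new))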
['Chapter 1 and 2 Test',
'Chapter 3 and 4 Test',
'Chapter 5 Test',
'Chapter 6 Test',
'Chapter 7 Test',
'Chapter 8 Test',
'Chapter 9 Test',
'Chapter 10 Test']
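
The two substitutions can also be combined into one. The following sketch uses only the re module and two of the column names above; the helper name clean_name is ours and is not part of the original code. It anchors the prefix to the start of the string and the ID to the end, so matching text elsewhere in a name would not be removed.

```python
import re

cols_test = ['MATH205 - Chapter 1 and 2 Test (380919)',
             'MATH205 - Chapter 10 Test (380929)']

# Match either the leading course prefix or the trailing parenthesised ID.
pattern = re.compile(r'^MATH205 - | \(\d+\)$')

def clean_name(name):
    """Remove the course prefix and the numeric ID from one column name."""
    return pattern.sub('', name)

cols_test_new = [clean_name(name) for name in cols_test]
print(cols_test_new)
```
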
This is the new list that will replace the old column names. The rest is similar to what we did in the second approach: we create a mapping and pass it to the rename() function of the pandas dataframe.
mapping = {key1: key2 for key1, key2 in zip(cols_test, cols_test_new)}
{'MATH205 - Chapter 1 and 2 Test (380919)': 'Chapter 1 and 2 Test',
'MATH205 - Chapter 3 and 4 Test (380916)': 'Chapter 3 and 4 Test',
'MATH205 - Chapter 5 Test (380926)': 'Chapter 5 Test',
'MATH205 - Chapter 6 Test (380900)': 'Chapter 6 Test',
'MATH205 - Chapter 7 Test (380902)': 'Chapter 7 Test',
'MATH205 - Chapter 8 Test (380938)': 'Chapter 8 Test',
'MATH205 - Chapter 9 Test (380922)': 'Chapter 9 Test',
'MATH205 - Chapter 10 Test (380929)': 'Chapter 10 Test'}
As before, the dict comprehension generates the key-value pairs: the key is the old column name and the value is the new column name. We can now rename the columns using this mapping.
df1.rename(columns=mapping)
| Chapter 1 and 2 Test | Chapter 3 and 4 Test | Chapter 5 Test | Chapter 6 Test | Chapter 7 Test | Chapter 8 Test | Chapter 9 Test | Chapter 10 Test |
|---------------------:|---------------------:|---------------:|---------------:|---------------:|---------------:|---------------:|----------------:|
| 53.13 | 53.13 | 53.13 | 53.13 | 53.13 | 53.13 | 53.13 | 53.09 |
| 47.34 | 48.72 | 38.73 | 34.43 | 30.39 | 46.17 | 47.50 | 32.76 |
| 50.63 | 0.00 | 43.09 | 25.18 | 7.49 | 19.87 | 4.25 | 18.21 |
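
As an alternative sketch, pandas exposes vectorised string methods on the column index, so the same cleanup can be done without building a mapping at all. The small frame below is a stand-in for df1 that uses only the first two columns of the table above, and the combined pattern is our own choice rather than the one used in the steps above.

```python
import pandas as pd

df1 = pd.DataFrame({
    'MATH205 - Chapter 1 and 2 Test (380919)': [53.13, 47.34, 50.63],
    'MATH205 - Chapter 3 and 4 Test (380916)': [53.13, 48.72, 0.00],
})

# Work on a copy so the original frame keeps its column names.
cleaned = df1.copy()
cleaned.columns = cleaned.columns.str.replace(
    r'^MATH205 - | \(\d+\)$', '', regex=True)

print(list(cleaned.columns))
```

The explicit mapping remains useful when you want to inspect or hand-edit the old-to-new pairs before applying them.
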