read_excel() modifies provided types dict when accessing file with duplicate column · Issue #42462 · pandas-dev/pandas

test.xlsx :

a a b c
1 1 b1 c1
2 2 b2 c2
3 3 b3 c3
import pandas as pd


types_dict = {'a': str,
             'b': str,
             'c': str,
             }


if __name__ == "__main__":
    df = pd.read_excel('./test.xlsx', dtype=types_dict)
    print(list(types_dict.keys()))
>> ['a', 'b', 'c', 'a.1']

Bug/Issue description:
When a .xlsx file with a duplicate column is loaded into a DataFrame using the dtype argument, read_excel() modifies the provided types_dict in place: it adds entries for the renamed duplicate columns (here 'a.1').

It seems to me like the modification of the types_dict is an unwanted side effect.
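
Until this is fixed, passing a copy of the dict should keep the caller's dict untouched. The sketch below illustrates the idea without needing test.xlsx: `dedup_and_extend_dtypes` is a hypothetical stand-in that only mimics the behavior reported above (it is not pandas code), and the column list is taken from the example file.

```python
def dedup_and_extend_dtypes(columns, dtype):
    """Hypothetical stand-in for the reported behavior, not pandas internals.

    Renames duplicate columns to 'name.N' and, like the behavior described
    in this report, writes the renamed keys into the dtype dict it was given.
    """
    seen = {}
    renamed = []
    for col in columns:
        count = seen.get(col, 0)
        new_name = col if count == 0 else f"{col}.{count}"
        if count > 0 and col in dtype:
            dtype[new_name] = dtype[col]  # in-place mutation of the argument
        seen[col] = count + 1
        renamed.append(new_name)
    return renamed


types_dict = {'a': str, 'b': str, 'c': str}
columns = ['a', 'a', 'b', 'c']  # header row of test.xlsx

# Passing the dict directly: the callee's additions leak back to the caller.
mutated = dict(types_dict)
dedup_and_extend_dtypes(columns, mutated)
keys_when_passed_directly = list(mutated.keys())

# Workaround: hand over a shallow copy so the original stays unchanged.
renamed = dedup_and_extend_dtypes(columns, dict(types_dict))
keys_after_copy = list(types_dict.keys())

print(keys_when_passed_directly)
print(keys_after_copy)
```

With pandas itself, the equivalent call would be `pd.read_excel('./test.xlsx', dtype=dict(types_dict))`. A shallow copy is enough here because only keys are added; the values are types and are not modified.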