Description
xref #13742 for additional cases.
In [23]: df1 = pd.DataFrame({'a':[1,2]}, index=pd.Index(['a', 'b'], name='idx'))
In [24]: df2 = pd.DataFrame({'b':[2,3]}, index=pd.Index(['b', 'c'], name='idx'))
In [26]: pd.concat([df1, df2], axis=1)
Out[26]:
     a    b
a  1.0  NaN
b  2.0  2.0
c  NaN  3.0
In [27]: print pd.concat([df1, df2], axis=1).index.name
None
So the issue seems to be limited to string indexes that are not equal. When the indexes of the two frames are equal (so no NaNs are introduced), the name is kept, and it is also kept with numerical indexes; see #13475 (comment)
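The three cases described above (equal string indexes, unequal string indexes, unequal numerical indexes) can be compared side by side with a small check. This is only a sketch: `concat_index_name` is a hypothetical helper written for this illustration, not part of pandas, and the result for the unequal string case depends on the pandas version, so no output is asserted for it here.

```python
import pandas as pd


def concat_index_name(left_labels, right_labels, name="idx"):
    """Concatenate two one-column frames along axis=1 and return the
    name of the resulting index (hypothetical helper for illustration)."""
    left = pd.DataFrame({"a": list(range(len(left_labels)))},
                        index=pd.Index(left_labels, name=name))
    right = pd.DataFrame({"b": list(range(len(right_labels)))},
                         index=pd.Index(right_labels, name=name))
    return pd.concat([left, right], axis=1).index.name


# Equal string indexes: no NaNs are introduced.
equal_strings = concat_index_name(["a", "b"], ["a", "b"])
# Unequal string indexes: the case reported in this issue.
unequal_strings = concat_index_name(["a", "b"], ["b", "c"])
# Unequal numerical indexes.
unequal_ints = concat_index_name([1, 2], [2, 3])
```

Inspecting `equal_strings`, `unequal_strings` and `unequal_ints` on a given installation shows which of the three cases drop the name.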
When I use the concat function on input dataframes that have index.name assigned, the resulting dataframe sometimes has index.name assigned and sometimes does not.
I ran the code below from the Python interpreter, in a conda environment with pandas 0.18.1.
Comparing the input files, I don't see any odd or extra characters around the "pert_well" column.
Code Sample, a copy-pastable example if possible
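import pandas
try:
    from StringIO import StringIO  # Python 2, as used in this report
except ImportError:
    from io import StringIO  # Python 3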
a_data = """x_amount_mg x_annotation x_mmoles_per_liter mfc_plate_name x_avg_mol_weight x_volume_ul pert_mfc_desc pert_iname x_purity pert_id_vendor pert_well pert_vehicle pert_mfc_id x_smiles x_mg_per_ml pert_dose_unit pert_dose pert_id pert_plate pert_type
0.04784 ACCEPT 10.0 B-REPO-01-B64-101 405.4084 11 Taltirelin Taltirelin 86.52 HY-B0596 C18 DMSO BRD-K93869735-001-01-1 CN1C(=O)C[C@H](NC1=O)C(=O)N[C@@H](Cc1cnc[nH]1)C(=O)N1CCC[C@H]1C(N)=O 4.054084 um 20.0 BRD-K93869735 PMEL008 trt_cp"""
b_data = """pert_well pert_2_type pert_2_id pert_2_mfc_id pert_2_mfc_desc pert_2_id_vendor pert_2_iname pert_2_dose pert_2_dose_unit pert_2_vehicle pert_3_type pert_3_id pert_3_mfc_id pert_3_mfc_desc pert_3_id_vendor pert_3_iname pert_3_dose pert_3_dose_unit pert_3_vehicle
A01 ctl_vehicle DMSO DMSO DMSO -666 DMSO -666 -666 -666 ctl_untrt CMAP-000 -666 UnTrt -666 -666 -666 -666 -666"""
d_data = """x_amount_mg x_annotation x_mmoles_per_liter mfc_plate_name x_avg_mol_weight x_volume_ul pert_mfc_desc pert_iname x_purity pert_id_vendor pert_well pert_vehicle pert_mfc_id x_smiles x_mg_per_ml pert_dose_unit pert_dose pert_id pert_plate pert_type
0.0 -666 -666 B-REPO-01-B64-107 -666 0 -666 -666 -666 -666 A01 -666 -666 -666 -666 -666 -666 CMAP-000 PMEL001 ctl_untrt"""
a = pandas.read_csv(StringIO(a_data), sep="\t", index_col="pert_well")
b = pandas.read_csv(StringIO(b_data), sep="\t", index_col="pert_well")
c = pandas.concat([a,b], axis=1)
c.index
d = pandas.read_csv(StringIO(d_data), sep="\t", index_col="pert_well")
e = pandas.concat([d,b], axis=1)
e.index
results (the output has length=384, so it appears to come from the full files rather than the one-row samples above):
c.index:
Index([u'A01', u'A02', u'A03', u'A04', u'A05', u'A06', u'A07', u'A08', u'A09',
       u'A10',
       ...
       u'P15', u'P16', u'P17', u'P18', u'P19', u'P20', u'P21', u'P22', u'P23',
       u'P24'],
      dtype='object', length=384)
e.index:
Index([u'A01', u'A02', u'A03', u'A04', u'A05', u'A06', u'A07', u'A08', u'A09',
       u'A10',
       ...
       u'P15', u'P16', u'P17', u'P18', u'P19', u'P20', u'P21', u'P22', u'P23',
       u'P24'],
      dtype='object', name=u'pert_well', length=384)
Expected Output
c.index.name should be "pert_well"
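Until the name is preserved by concat itself, one possible workaround is to restore it after the fact when every input frame agrees on the index name. The sketch below assumes that this is the desired behaviour; `concat_keep_index_name` is a hypothetical helper, not a pandas function, and the two frames are the ones from the description.

```python
import pandas as pd


def concat_keep_index_name(frames, **kwargs):
    """Concatenate along axis=1; if the result lost its index name but all
    inputs share a single index name, put that name back (hypothetical)."""
    result = pd.concat(frames, axis=1, **kwargs)
    names = set(frame.index.name for frame in frames)
    if result.index.name is None and len(names) == 1:
        result.index.name = names.pop()
    return result


df1 = pd.DataFrame({'a': [1, 2]}, index=pd.Index(['a', 'b'], name='idx'))
df2 = pd.DataFrame({'b': [2, 3]}, index=pd.Index(['b', 'c'], name='idx'))
result = concat_keep_index_name([df1, df2])
```

If the inputs carry different index names, the helper leaves the result untouched rather than guessing which name to use.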
output of pd.show_versions()
INSTALLED VERSIONS
commit: None
python: 2.7.11.final.0
python-bits: 64
OS: Linux
OS-release: 2.6.32-573.7.1.el6.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: C
pandas: 0.18.1
nose: None
pip: 8.1.2
setuptools: 23.0.0
Cython: None
numpy: 1.11.0
scipy: None
statsmodels: None
xarray: None
IPython: None
sphinx: None
patsy: None
dateutil: 2.5.3
pytz: 2016.4
blosc: None
bottleneck: None
tables: None
numexpr: None
matplotlib: None
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: None
boto: None
pandas_datareader: None