python - Assign unique id to columns pandas data frame -


Hello, I have the following DataFrame:

df =
a      b
john   tom
homer  bart
tom    maggie
lisa   john

I want to assign each name a unique ID, so that it returns:

df =
a      b       c  d
john   tom     0  1
homer  bart    2  3
tom    maggie  1  4
lisa   john    5  0

What I have done is the following:

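# Stack columns a and b into a single column and collect the unique names.
ll1 = pd.concat([df.a, df.b], ignore_index=True)
ll1 = pd.DataFrame(ll1)
ll1.columns = ['a']
nameun = pd.unique(ll1.a.ravel())
# llout is the DataFrame holding columns a and b (its definition is not shown).
llout['c'] = 0
llout['d'] = 0
nn = list(nameun)
# Look up each name's position in the list, one row at a time.
for i in range(1, len(llout)):
    llout.c[i] = nn.index(llout.a[i])
    llout.d[i] = nn.index(llout.b[i])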

But since I have a large dataset, this process is very slow.

Here's one way. First, get the array of unique names:

In [11]: df.values.ravel()
Out[11]: array(['john', 'tom', 'homer', 'bart', 'tom', 'maggie', 'lisa', 'john'], dtype=object)

In [12]: pd.unique(df.values.ravel())
Out[12]: array(['john', 'tom', 'homer', 'bart', 'maggie', 'lisa'], dtype=object)

Then make this a Series, mapping the names to their respective numbers:

In [13]: names = pd.unique(df.values.ravel())

In [14]: names = pd.Series(np.arange(len(names)), names)

In [15]: names
Out[15]:
john      0
tom       1
homer     2
bart      3
maggie    4
lisa      5
dtype: int64
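As a self-contained sketch of the lookup table built above: the DataFrame below is reconstructed from the question's example, and `unique_names` is a name introduced here only for readability. Because `ravel()` flattens the values row by row, the IDs follow the order in which names first appear when reading across each row.

```python
import numpy as np
import pandas as pd

# Example data reconstructed from the question.
df = pd.DataFrame({
    "a": ["john", "homer", "tom", "lisa"],
    "b": ["tom", "bart", "maggie", "john"],
})

# Flatten row by row, then keep each name once, in order of first appearance.
unique_names = pd.unique(df.values.ravel())

# Index = name, value = integer ID.
names = pd.Series(np.arange(len(unique_names)), index=unique_names)

print(names["maggie"])     # ID of a known name
print(names.get("marge"))  # a name not in the index yields None, not an error
```

The `get` method is what makes the Series usable as an element-wise lookup function in the next step.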

Now use applymap and names.get to look up these numbers:

In [16]: df.applymap(names.get)
Out[16]:
   a  b
0  0  1
1  2  3
2  1  4
3  5  0

And assign the result to the correct columns:

In [17]: df[["c", "d"]] = df.applymap(names.get)

In [18]: df
Out[18]:
       a       b  c  d
0   john     tom  0  1
1  homer    bart  2  3
2    tom  maggie  1  4
3   lisa    john  5  0
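For reference, here is a sketch of the whole sequence as a script rather than an interactive session. It swaps `applymap` for a per-column `Series.map`, because `DataFrame.applymap` is deprecated in recent pandas releases (2.1 onward, as far as I know) in favor of `DataFrame.map`, while `Series.map` works the same way in old and new versions. The example data is reconstructed from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "a": ["john", "homer", "tom", "lisa"],
    "b": ["tom", "bart", "maggie", "john"],
})

# Build the name -> ID lookup from both name columns.
unique_names = pd.unique(df[["a", "b"]].values.ravel())
names = pd.Series(np.arange(len(unique_names)), index=unique_names)

# Series.map accepts a Series and looks each value up in its index.
df["c"] = df["a"].map(names)
df["d"] = df["b"].map(names)

print(df)
```

Mapping whole columns this way avoids the row-by-row `list.index` search used in the question.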

Note: this assumes that all of the values are names to begin with; you may want to restrict it to the name columns only:

df[['a', 'b']].values.ravel()
...
df[['a', 'b']].applymap(names.get)
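When only some columns hold names, the same numbering can also be produced with `pd.factorize`. This is a different technique from the lookup Series used in the answer, shown here only as a sketch. The example data, including the extra `age` column, is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "a": ["john", "homer", "tom", "lisa"],
    "b": ["tom", "bart", "maggie", "john"],
    "age": [40, 38, 35, 8],  # hypothetical non-name column, left untouched
})

name_cols = ["a", "b"]

# factorize returns integer codes in order of first appearance,
# plus the array of unique values.
codes, uniques = pd.factorize(df[name_cols].values.ravel())

# Restore the (rows, columns) shape so each code lines up with its cell.
ids = codes.reshape(len(df), len(name_cols))
df["c"] = ids[:, 0]
df["d"] = ids[:, 1]

print(df)
```

Here `uniques` plays the role of the `names` index: position `i` in `uniques` is the name that received ID `i`.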
