Commit 7a7279d

v0.2.1 (#26)
1 parent 802f1a6 commit 7a7279d

8 files changed: +155 -80 lines changed

biopandas/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -4,5 +4,5 @@
 # Project Website: http://rasbt.github.io/biopandas/
 # Code Repository: https://github.com/rasbt/biopandas

-__version__ = '0.2.1.dev0'
+__version__ = '0.2.1'
 __author__ = "Sebastian Raschka <[email protected]>"

docs/sources/CHANGELOG.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 The CHANGELOG for the current development version is available at
 [https://github.com/rasbt/biopandas/blob/master/docs/sources/CHANGELOG.md](https://github.com/rasbt/biopandas/blob/master/docs/sources/CHANGELOG.md).

-### 0.2.1dev
+### 0.2.1 (2017-05-11)

 ##### Downloads

docs/sources/CONTRIBUTING.md

Lines changed: 3 additions & 3 deletions
@@ -262,7 +262,7 @@ For example,
 Please note that documents containing code examples are generated from IPython Notebook files and converted to markdown via

 ```bash
-~/github/biopandas/docs/examples$ nbconvert --to markdown <file.ipynb>
+~/github/biopandas/docs/sources/tutorials$ nbconvert --to markdown <file.ipynb>
 ```

 The markdown file should be placed into the documentation directory at `biopandas/docs/sources` to build the documentation via MkDocs.
@@ -349,7 +349,7 @@ $ pip uninstall biopandas
 Consider deploying the package to the PyPI test server first. The setup instructions can be found [here](https://wiki.python.org/moin/TestPyPI).

 ```bash
-$ python setup.py sdist upload -r https://testpypi.python.org/pypi
+$ python setup.py sdist bdist_wheel upload -r https://testpypi.python.org/pypi
 ```

 Test if it can be installed from there by executing
@@ -367,7 +367,7 @@ $ pip uninstall biopandas
 After this dry-run succeeded, repeat this process using the "real" PyPI:

 ```bash
-$ python setup.py sdist upload
+$ python setup.py sdist bdist_wheel upload
 ```

 #### 4. Removing the virtual environment

docs/sources/api_subpackages/biopandas.mol2.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-biopandas version: 0.2.1.dev0
+biopandas version: 0.2.1
 ## PandasMol2

 *PandasMol2()*

docs/sources/api_subpackages/biopandas.pdb.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-biopandas version: 0.2.1.dev0
+biopandas version: 0.2.1
 ## PandasPdb

 *PandasPdb()*
Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-biopandas version: 0.2.1.dev0
+biopandas version: 0.2.1

docs/sources/tutorials/Working_with_MOL2_Structures_in_DataFrames.md

Lines changed: 29 additions & 1 deletion
@@ -1,3 +1,31 @@
+
+BioPandas
+
+Author: Sebastian Raschka <[email protected]>
+License: BSD 3 clause
+Project Website: http://rasbt.github.io/biopandas/
+Code Repository: https://github.com/rasbt/biopandas
+
+
+```python
+%load_ext watermark
+%watermark -d -u -p pandas,biopandas
+```
+
+last updated: 2017-04-02
+
+pandas 0.19.2
+biopandas 0.2.0.dev0
+
+
+
+```python
+from biopandas.mol2 import PandasMol2
+import pandas as pd
+pd.set_option('display.width', 600)
+pd.set_option('display.max_columns', 8)
+```
+
 # Working with MOL2 Structures in DataFrames

 The Tripos MOL2 format is a common format for working with small molecules. In this tutorial, we will go over some examples that illustrate how we can use Biopandas' MOL2 DataFrames to analyze molecules conveniently.
@@ -569,7 +597,7 @@ A list of all the allowed atom types that can be found in Tripos MOL2 files is p
 S.3 sulfur sp3
 S.2 sulfur sp2
 S.O sulfoxide sulfur
-S.O2 sulfone sulfur
+S.O2/S.o2 sulfone sulfur
 P.3 phosphorous sp3
 F fluorine
 H hydrogen
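
Note on the atom-type change above: the table now lists two spellings for sulfone sulfur, `S.O2` and `S.o2`, so a filter on only one spelling may miss atoms. The snippet below is a minimal sketch of a case-insensitive match, not code from this commit. The DataFrame is made-up illustration data, and the `atom_type` column name is assumed to match the one used in BioPandas' MOL2 DataFrames.

```python
import pandas as pd

# Hypothetical atom table (illustration only); `atom_type` is assumed to
# mirror the column name used in BioPandas' MOL2 DataFrames.
atoms = pd.DataFrame({
    'atom_name': ['S1', 'O1', 'S2', 'C1'],
    'atom_type': ['S.O2', 'O.2', 'S.o2', 'C.3'],
})

# Compare case-insensitively so that both spellings of the
# sulfone sulfur type ('S.O2' and 'S.o2') are matched.
is_sulfone = atoms['atom_type'].str.upper() == 'S.O2'
sulfone_names = atoms.loc[is_sulfone, 'atom_name'].tolist()

print(sulfone_names)  # ['S1', 'S2']
```

The sulfoxide type `S.O` is not matched, because the comparison is on the full uppercased string, not a prefix.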

docs/sources/tutorials/Working_with_PDB_Structures_in_DataFrames.md

Lines changed: 118 additions & 71 deletions
@@ -1,3 +1,31 @@
+
+BioPandas
+
+Author: Sebastian Raschka <[email protected]>
+License: BSD 3 clause
+Project Website: http://rasbt.github.io/biopandas/
+Code Repository: https://github.com/rasbt/biopandas
+
+
+```python
+%load_ext watermark
+%watermark -d -u -p pandas,biopandas
+```
+
+last updated: 2017-04-12
+
+pandas 0.19.2
+biopandas 0.2.1.dev0
+
+
+
+```python
+from biopandas.pdb import PandasPdb
+import pandas as pd
+pd.set_option('display.width', 600)
+pd.set_option('display.max_columns', 8)
+```
+
 # Working with PDB Structures in DataFrames

 ## Loading PDB Files
@@ -29,7 +57,7 @@ ppdb.read_pdb('./data/3eiy.pdb')



-<biopandas.pdb.pandas_pdb.PandasPdb at 0x106795898>
+<biopandas.pdb.pandas_pdb.PandasPdb at 0x10462bf28>



@@ -45,7 +73,7 @@ ppdb.read_pdb('./data/3eiy.pdb.gz')



-<biopandas.pdb.pandas_pdb.PandasPdb at 0x106795898>
+<biopandas.pdb.pandas_pdb.PandasPdb at 0x10462bf28>



@@ -207,7 +235,7 @@ ppdb.df.keys()



-dict_keys(['HETATM', 'ANISOU', 'ATOM', 'OTHERS'])
+dict_keys(['ATOM', 'HETATM', 'ANISOU', 'OTHERS'])



@@ -1142,81 +1170,100 @@ Residues in the `residue_name` field can be converted into 1-letter amino acid c

 ```python
 from biopandas.pdb import PandasPdb
-ppdb = PandasPdb().read_pdb('./data/3eiy.pdb.gz')
-ppdb.amino3to1()
-# By default, `amino3to1` returns a pandas Series object,
-# and to convert it into a Python list, you can wrap it in list
-# constructor, e.g.,
-# `list(ppdb.amino3to1())`
+ppdb = PandasPdb().fetch_pdb('5mtn')
+sequence = ppdb.amino3to1()
+sequence.tail()
 ```




-0 S
-6 F
-17 S
-23 N
-31 V
-38 P
-45 A
-50 G
-54 K
-63 D
-71 L
-79 P
-86 Q
-95 D
-103 F
-114 N
-122 V
-129 I
-137 I
-145 E
-154 I
-162 P
-169 A
-174 Q
-183 S
-189 E
-198 P
-205 V
-212 K
-221 Y
-..
-1100 E
-1109 K
-1114 G
-1118 K
-1127 W
-1141 V
-1148 K
-1153 V
-1160 E
-1169 G
-1173 W
-1187 D
-1195 G
-1199 I
-1207 D
-1215 A
-1220 A
-1225 H
-1235 K
-1244 E
-1253 I
-1261 T
-1268 D
-1276 G
-1280 V
-1287 A
-1292 N
-1300 F
-1311 K
-1320 K
-Name: residue_name, dtype: object
+<div>
+<table border="1" class="dataframe">
+<thead>
+<tr style="text-align: right;">
+<th></th>
+<th>chain_id</th>
+<th>residue_name</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<th>1378</th>
+<td>B</td>
+<td>I</td>
+</tr>
+<tr>
+<th>1386</th>
+<td>B</td>
+<td>N</td>
+</tr>
+<tr>
+<th>1394</th>
+<td>B</td>
+<td>Y</td>
+</tr>
+<tr>
+<th>1406</th>
+<td>B</td>
+<td>R</td>
+</tr>
+<tr>
+<th>1417</th>
+<td>B</td>
+<td>T</td>
+</tr>
+</tbody>
+</table>
+</div>
+
+
+
+As shown above, the `amino3to1` method returns a `DataFrame` containing the `chain_id` and `residue_name` of the translated 1-letter amino acids. If you like to work with the sequence as a Python list of string characters, you could do the following:
+
+
+```python
+sequence_list = list(sequence.loc[sequence['chain_id'] == 'A', 'residue_name'])
+sequence_list[-5:] # last 5 residues of chain A
+```
+
+
+
+
+['V', 'R', 'H', 'Y', 'T']
+
+
+
+And if you prefer to work with the sequence as a string, you can use the `join` method:


+```python
+''.join(sequence.loc[sequence['chain_id'] == 'A', 'residue_name'])
+```
+
+
+
+
+'SLEPEPWFFKNLSRKDAERQLLAPGNTHGSFLIRESESTAGSFSLSVRDFDQGEVVKHYKIRNLDNGGFYISPRITFPGLHELVRHYT'
+
+
+
+To iterate over the sequences of multi-chain proteins, you can use the `unique` method as shown below:
+
+
+```python
+for chain_id in sequence['chain_id'].unique():
+    print('\nChain ID: %s' % chain_id)
+    print(''.join(sequence.loc[sequence['chain_id'] == chain_id, 'residue_name']))
+```
+
+
+Chain ID: A
+SLEPEPWFFKNLSRKDAERQLLAPGNTHGSFLIRESESTAGSFSLSVRDFDQGEVVKHYKIRNLDNGGFYISPRITFPGLHELVRHYT
+
+Chain ID: B
+SVSSVPTKLEVVAATPTSLLISWDAPAVTVVYYLITYGETGSPWPGGQAFEVPGSKSTATISGLKPGVDYTITVYAHRSSYGYSENPISINYRT
+

 ## Wrapping it up - Saving PDB structures
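
Note on the updated `amino3to1` section: because the method now returns a DataFrame with `chain_id` and `residue_name` columns, the per-chain loop in the tutorial could also be written as a single pandas `groupby`. This is only a sketch and not part of the commit. The small DataFrame is a hypothetical stand-in for the real `amino3to1()` output, so it runs without downloading a structure.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by `amino3to1()`:
# one row per residue, with the two columns shown in the tutorial output.
sequence = pd.DataFrame({
    'chain_id': ['A', 'A', 'A', 'B', 'B'],
    'residue_name': ['S', 'L', 'E', 'S', 'V'],
})

# Group the residues by chain (sort=False keeps the chains in order of
# first appearance) and join each group's 1-letter codes into one string.
chain_sequences = (
    sequence.groupby('chain_id', sort=False)['residue_name']
    .apply(''.join)
    .to_dict()
)

print(chain_sequences)  # {'A': 'SLE', 'B': 'SV'}
```

The resulting dictionary maps each chain ID to its sequence string, which may be handier than printing inside a loop when the sequences are needed later.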
