The picks.csv output from PhaseNet contains some clunky formatting that requires the user to perform several string manipulations to properly format the itp, tp_prob, its, ts_prob columns.
I will show an example of reading the csv with pandas, although reading it with the csv package runs into the same formatting issues. I will also share the function I had to write to format the entries correctly.
Pandas
import pandas as pd
df = pd.read_csv('output/picks.csv')
The result is a dataframe containing strings in the itp, tp_prob, its, ts_prob columns.
The values are not uniformly separated either, which means the str.split() method can't be applied directly to convert each string into a list. Ideally, the csv would contain a uniform, comma-separated list of values. Another solution would be to also save a pickle file to the output directory that contains the lists in object form.
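To illustrate the pickle idea, here is a minimal sketch of what saving and reloading the lists in object form could look like. The file name picks.pkl and the list-of-dicts layout are assumptions made for this example only; they are not taken from PhaseNet.

```python
import os
import pickle
import tempfile

# Hypothetical picks for two traces; the values are made up for illustration.
picks = [
    {"itp": [120, 3045], "tp_prob": [0.98, 0.5], "its": [410], "ts_prob": [0.75]},
    {"itp": [], "tp_prob": [], "its": [], "ts_prob": []},
]

# "picks.pkl" is an assumed file name, written to a temporary directory here.
path = os.path.join(tempfile.mkdtemp(), "picks.pkl")

# Save the Python objects as they are, so no string parsing is needed later.
with open(path, "wb") as f:
    pickle.dump(picks, f)

# Reload: the lists come back as real lists of ints and floats.
with open(path, "rb") as f:
    loaded = pickle.load(f)
```

Because the objects round-trip unchanged, the post-processing step can index into loaded[0]["itp"] directly instead of cleaning strings first.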
To fix the formatting with the current picks.csv, I made the following function:
import shlex
import pandas as pd

df = pd.read_csv('output/picks.csv')

def pickConverter(df):
    # Pick indices: strip the brackets, split on whitespace, cast to int.
    for col in ['itp', 'its']:
        pick_entry_list = []
        for x in range(len(df)):
            try:
                pick_entry_list.append(list(map(int, shlex.split(df[col][x].strip('[]')))))
            except AttributeError:
                # Non-string entries (e.g. NaN) have no .strip(); store an empty list.
                pick_entry_list.append([])
        df[col] = pick_entry_list
    # Pick probabilities: same parsing, cast to float.
    for col in ['tp_prob', 'ts_prob']:
        prob_entry_list = []
        for x in range(len(df)):
            try:
                prob_entry_list.append(list(map(float, shlex.split(df[col][x].strip('[]')))))
            except AttributeError:
                prob_entry_list.append([])
        df[col] = prob_entry_list
    return df
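A more compact variant of the same idea is sketched below. It pulls the numeric tokens out of each cell with a regular expression, so it does not care whether the values are separated by one space, several spaces, or commas. The helper name parse_numbers and the sample strings are hypothetical; the sample strings are only an assumption about what the entries in picks.csv look like.

```python
import re

# Matches integers and floats, including scientific notation such as 1.2e-05.
_NUMBER = re.compile(r"[-+]?(?:\d+\.?\d*|\.\d+)(?:[eE][-+]?\d+)?")

def parse_numbers(cell, cast):
    """Return the numbers found in one csv cell as a list of cast(...) values.

    Non-string cells (for example NaN from an empty csv field) give [].
    """
    if not isinstance(cell, str):
        return []
    return [cast(token) for token in _NUMBER.findall(cell)]

# Assumed examples of how the cells might be formatted.
itp = parse_numbers("[ 120  3045]", int)
tp_prob = parse_numbers("[0.98   0.5]", float)
empty = parse_numbers("[]", int)
missing = parse_numbers(float("nan"), float)
```

With a dataframe loaded as above, this could be applied per column, for example df['itp'] = df['itp'].map(lambda c: parse_numbers(c, int)), and likewise with float for the probability columns.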
I described this issue under AI4EPS#9 and here is the PR with the fix. The fix is a two-part solution.
First, I open the picks.csv file as fclog with the csv library and write the header row. I then open picks.csv in append mode and write the results to it batch by batch. The results are converted from arrays to lists (this removes the extra whitespace), and each row is then written as a list of results instead of a single pre-formatted string. This also fixes the new-line issue I had opened and resolved earlier.
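The two steps described above can be sketched roughly as follows. This is not the code from the PR: the function names, the fname column, and the sample values are assumptions for illustration, and the standard-library array type stands in for the arrays the model produces (both offer a tolist() method).

```python
import ast
import csv
import os
import tempfile
from array import array

# Assumed column layout; only the four pick columns are named in the issue.
HEADER = ["fname", "itp", "tp_prob", "its", "ts_prob"]

def to_list(values):
    # Arrays expose tolist(); fall back to list() for other iterables.
    return values.tolist() if hasattr(values, "tolist") else list(values)

def write_header(path):
    # Step 1: create the file and write the header row once.
    with open(path, "w", newline="") as f:
        csv.writer(f).writerow(HEADER)

def append_batch(path, batch):
    # Step 2: reopen in append mode and write one row per result.
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        for fname, itp, tp_prob, its, ts_prob in batch:
            writer.writerow([fname, to_list(itp), to_list(tp_prob),
                             to_list(its), to_list(ts_prob)])

def read_picks(path):
    # Each list cell is stored as text like "[120, 3045]", so it can be
    # turned back into a Python list with ast.literal_eval.
    with open(path, newline="") as f:
        rows = list(csv.DictReader(f))
    for row in rows:
        for col in HEADER[1:]:
            row[col] = ast.literal_eval(row[col])
    return rows

path = os.path.join(tempfile.mkdtemp(), "picks.csv")
write_header(path)
append_batch(path, [("a.npz", array("i", [120, 3045]), array("d", [0.98, 0.5]),
                     array("i", [410]), array("d", [0.75]))])
append_batch(path, [("b.npz", array("i"), array("d"), array("i"), array("d"))])
rows = read_picks(path)
```

Writing Python lists rather than array strings means every cell has uniform comma separation, which is what makes the literal_eval step on the reading side possible.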
I have tried and tested the method and it works as hoped.
Best,
Lenni
Thanks for the suggestion! I have updated the csv format without extra spaces. I have also added a pickle format which could be directly loaded for post processing. #11