
Cannot Get Multiple DoA Using Grid Search #268

Open
johntcuellar opened this issue Jul 1, 2022 · 4 comments
johntcuellar commented Jul 1, 2022

Hi FakuFaku,

I cannot seem to get grid search to work for multiple signal sources. I used Issue #166 as a starting point; however, I am unsure of how to proceed from here. Please see my code below. Thank you, and please excuse the messy code; it is a work in progress.

In this example, I set two signals in the room. The input azimuths are 19.5 and 90 degrees and the colatitudes are 38 and 57 degrees. The output I get is as follows. As you can see, the colatitudes are way off:

Recovered azimuth: [89.88361159 20.28753509] degrees
  Error: [70.38361159  0.78753509] degrees
  Error: [ 0.11638841 69.71246491] degrees
  Recovered colatitude: [121.61776733  37.83784526] degrees
  Error: [83.61776733  0.16215474] degrees
  Error: [64.61776733 19.16215474] degrees
import numpy as np
from scipy.signal import fftconvolve
import matplotlib.pyplot as plt
import time
import pyroomacoustics as pra
from pyroomacoustics.doa import circ_dist
from scipy.io import wavfile

######
# circ_dist (imported above) gives a meaningful distance measure on the circle

# Location of original source
azimuth_1 = 19.5 / 180.0 * np.pi  # with respect to x axis
colatitude_1 = 38.0 / 180.0 * np.pi
azimuth_2 = 90.0 / 180.0 * np.pi  # with respect to x axis
colatitude_2 = 57.0 / 180.0 * np.pi
distance = 3.0  # 3 meters

# Make Nearly Equidistant Points on a Spherical Grid for more than one Sound Source searching
#n_grid = pra.doa.grid.GridSphere(n_points=1000, spherical_points=None)
n_grid = 4000
#######################
# algorithms parameters
SNR = 0.0  # signal-to-noise ratio
c = 343.0  # speed of sound
fs = 16000  # sampling frequency - Comment out when importing .wav file
nfft = 256  # FFT size
fmin = 500 # Min Frequency Hz
fmax = 3500 # Max Frequency Hz
freq_bins = np.arange(5, 60)  # FFT bins to use for estimation
num_src = 2 # number of sources
dim = 3 # Dimension of room - '2' for 2D or '3' for 3D

# compute the noise variance
sigma2 = 10 ** (-SNR / 10) / (4.0 * np.pi * distance) ** 2
mic_to_mic_dist = 0.1 # distance from center of mic array to mic
z_offset_factor = 0.9 # Mic Z offset as a factor of z = room_dim[2]


# Define Room Materials

m = pra.make_materials(
    ceiling="hard_surface",
    floor="linoleum_on_concrete",
    east="brickwork",
    west="brickwork",
    north="brickwork",
    south="brickwork",)

# If importing a .wav file as a signal source, use below template. .wav file must
# be in the same folder location as the program
# ** NOTE: fs is extracted from the .wav file here

#fs, source_signal = wavfile.read('guitar_44k.wav')
#aroom.add_source([1.5, 1.7, 1.6], signal=source_signal)

# Create an anechoic room
room_dim = np.r_[10.0, 10.0, 10.0] # Original Line
#aroom = pra.ShoeBox(room_dim, fs=fs, materials = m, max_order = 0, sigma2_awgn=sigma2, air_absorption = True, ray_tracing = True)
aroom = pra.ShoeBox(room_dim, fs=fs, materials = m, max_order = 0, sigma2_awgn=sigma2, air_absorption = False, ray_tracing = False)

# Activate Ray Tracing
#aroom.set_ray_tracing(receiver_radius=0.5)

# add the source(s)
# Source 1
source_location = room_dim / 2 + distance * np.r_[np.cos(azimuth_1)*np.sin(colatitude_1), np.sin(azimuth_1)*np.sin(colatitude_1), np.cos(colatitude_1)]
source_signal = np.random.randn((nfft // 2 + 1) * nfft)
aroom.add_source(source_location, signal = source_signal)

# Source 2
source_location_2 = room_dim / 2 + distance * np.r_[np.cos(azimuth_2)*np.sin(colatitude_2), np.sin(azimuth_2)*np.sin(colatitude_2), np.cos(colatitude_2)]
source_signal_2 = np.random.randn((nfft // 2 + 1) * nfft)
aroom.add_source(source_location_2, signal = source_signal_2)

# Circular array of 12 mics of radius mic_to_mic_dist, centered in the room
mic_angles = np.arange(12) * np.pi / 6  # one mic every 30 degrees
R = np.array([
    room_dim[0] / 2 + mic_to_mic_dist * np.cos(mic_angles),
    room_dim[1] / 2 + mic_to_mic_dist * np.sin(mic_angles),
    np.full(12, room_dim[2] / 2),
])  # shape (3, 12): one column per microphone
aroom.add_microphone_array(pra.MicrophoneArray(R, fs=aroom.fs))


# Cubic Mic Array
#R = np.c_[    
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(1 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(1 * np.pi / 4), z_offset_factor*room_dim[2]/2],  # Mic 1
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(2 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(2 * np.pi / 4), z_offset_factor*room_dim[2]/2],  # Mic 2
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(3 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(3 * np.pi / 4), z_offset_factor*room_dim[2]/2],  # Mic 3
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(4 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(4 * np.pi / 4), z_offset_factor*room_dim[2]/2],  # Mic 4
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(1 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(1 * np.pi / 4), room_dim[2]/2],  # Mic 5
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(2 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(2 * np.pi / 4), room_dim[2]/2],  # Mic 6
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(3 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(3 * np.pi / 4), room_dim[2]/2],  # Mic 7
#    [room_dim[0]/2 + mic_to_mic_dist*np.cos(4 * np.pi / 4), room_dim[1]/2 + mic_to_mic_dist*np.sin(4 * np.pi / 4), room_dim[2]/2]]  # Mic 8
#aroom.add_microphone_array(pra.MicrophoneArray(R, fs=aroom.fs))

# Use the following function to compute the rir using either 'ism' method, 'rt' method, or 'hybrid' method

#chrono = time.time()
aroom.compute_rir()
#print("Done in", time.time() - chrono, "seconds.")
#print("RT60:", aroom.measure_rt60()[0, 0, 0])
#aroom.plot_rir()
#plt.show() # Plot RiR

# run the simulation
aroom.simulate()

# Record the audio to a wav file at the microphone
#audio_reverb = aroom.mic_array.to_wav("aaa.wav", norm=True, bitdepth=np.int16)



################################
# Compute the STFT frames needed
X = np.array(
    [
        pra.transform.stft.analysis(signal, nfft, nfft // 2).T
        for signal in aroom.mic_array.signals
    ]
)

##############################################
# Now we can test all the algorithms available
algo_names = sorted(pra.doa.algorithms.keys())

# Construct the DOA object ('NormMUSIC' here; loop over algo_names to test all)
# the max_four parameter is necessary for FRIDA only
doa = pra.doa.algorithms['NormMUSIC'](R, fs, nfft, num_src=num_src, c=c, max_four=4, n_grid=n_grid, dim=dim)

# this call performs the localization on the frames in X
doa.locate_sources(X, num_src=num_src, freq_range=[fmin, fmax], freq_bins=freq_bins)

#plot_individual_spectrum() # Plot Steered Spectrum for Each Frequency
#doa.polar_plt_dirac()

# doa.azimuth_recon contains the reconstructed locations of the sources
print("  Recovered azimuth:", doa.azimuth_recon / np.pi * 180.0, "degrees")
print("  Error:", circ_dist(azimuth_1, doa.azimuth_recon) / np.pi * 180.0, "degrees")
print("  Error:", circ_dist(azimuth_2, doa.azimuth_recon) / np.pi * 180.0, "degrees")

print("  Recovered colatitude:", doa.colatitude_recon / np.pi * 180.0, "degrees")
print("  Error:", circ_dist(colatitude_1, doa.colatitude_recon) / np.pi * 180.0, "degrees")
print("  Error:", circ_dist(colatitude_2, doa.colatitude_recon) / np.pi * 180.0, "degrees")

plt.show()
@fakufaku
Collaborator

fakufaku commented Jul 6, 2022

It looks like everything is working correctly. There are two things you should pay attention to.

  1. The order in which the sources are recovered is not guaranteed. This means that when you measure the error you should consider all permutations of the sources and pick the one minimizing the total error. This can be done with the Hungarian algorithm, which is implemented in SciPy as the linear_sum_assignment function.
  2. It seems that you are using a planar array, i.e. all the microphones lie in a plane. In that case, sources placed symmetrically with respect to that plane cannot be told apart by the DOA algorithm. You need to resolve the ambiguity, for example by forcing all recovered sources to lie in the upper half-space (colatitude <= 90 degrees).
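As a minimal sketch of the first point, the matching can be done with scipy.optimize.linear_sum_assignment on a matrix of pairwise angular errors. The `circ_dist` helper below is a stand-in for `pyroomacoustics.doa.circ_dist`, and the angles are taken from the output quoted above:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def circ_dist(a, b):
    # angular distance on the circle, in [0, pi]
    return np.abs(np.angle(np.exp(1j * (a - b))))

true_az = np.radians([19.5, 90.0])
est_az = np.radians([89.88361159, 20.28753509])

# cost[i, j] = angular distance between true source i and estimated source j
cost = circ_dist(true_az[:, None], est_az[None, :])

# Hungarian algorithm: permutation of estimates minimizing the total error
row, col = linear_sum_assignment(cost)

err_deg = np.degrees(cost[row, col])
print(col)      # index of the estimate matched to each true source
print(err_deg)  # per-source azimuth error in degrees
```

With the numbers above, estimate 1 is matched to true source 0 and estimate 0 to true source 1, and the per-source errors drop to well under a degree instead of the ~70-degree values obtained by comparing in the original order.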

In your example, the recovered sources are

  true az   true co   est az        est co
  19.5      38.0      20.28753509   37.83784526
  90.0      57.0      89.88361159   121.61776733

I have aligned the ground truth and estimated values to minimize the error.
You can see that all values are pretty good, except the colatitude of source 2: 121.6 != 57.0. However, 121.6 is actually close to the point symmetric to 57.0 with respect to the xy-plane, i.e. 121.6 - 2 * (121.6 - 90) = 58.4 ~ 57.

@johntcuellar
Author

johntcuellar commented Oct 11, 2022 via email

@fakufaku
Collaborator

fakufaku commented Oct 11, 2022

Hello @johntcuellar ,

  1. Sure. The order of the source locations returned by the algorithm is not guaranteed to match that of the ground truth. A common practice is to measure the error of all (estimated, reference) pairs of sources and choose the assignment that minimizes the total error.
  2. This is just standard geometry. The colatitude goes from 0 degrees for a source right at the north pole to 180 degrees for one at the south pole. The colatitude of the source reflected by the x-y plane is given by co_refl = co - 2 * (co - 90).
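The reflection formula simplifies to 180 - co, so the planar-array ambiguity can be resolved by folding every estimate into the upper hemisphere. A minimal sketch (the `fold_colatitude` helper is hypothetical, not part of pyroomacoustics):

```python
def fold_colatitude(co_deg):
    # co_refl = co - 2 * (co - 90) = 180 - co; apply only when the
    # estimate falls below the xy-plane (colatitude > 90 degrees)
    return 180.0 - co_deg if co_deg > 90.0 else co_deg

# the ambiguous estimate from the output above folds to ~58.4,
# close to the true colatitude of 57 degrees
print(fold_colatitude(121.61776733))
print(fold_colatitude(37.83784526))  # upper-hemisphere estimates are unchanged
```

Note that this fold is only valid because, with a planar array, the true source and its mirror image are indistinguishable; for a non-planar array no such correction should be applied.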

Hope this clarifies things. If not, feel free to reply to this!

@johntcuellar
Author

johntcuellar commented Oct 11, 2022 via email
