Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incomplete panel data causes segfault and R to abort #56

Open
LiefEsbenshade opened this issue Aug 22, 2021 · 2 comments · May be fixed by #64
Open

Incomplete panel data causes segfault and R to abort #56

LiefEsbenshade opened this issue Aug 22, 2021 · 2 comments · May be fixed by #64

Comments

@LiefEsbenshade
Copy link

We've found that incomplete panel data causes a segfault and R to abort. This is related to #53 where missing values in the panel cause a segfault. Here we have determined that a panel data with missing rows (e.g. the entire row is absent) will also cause a segfault. Included here is a reproducible example of the behavior. I've also added an example of a simple function that could be added to the top of relevant augsynth and multisynth function calls so that an error is returned instead of crashing R.

library(augsynth)
library(dplyr)

kansas_2 <- kansas %>% 
  tidylog::filter(!(fips == 2 & year_qtr == 1995.25)) # drop an arbitrary row creating incomplete panel data


# # This will cause a segfault and R to abort
# syn <- augsynth(lngdpcapita ~ treated, fips, year_qtr, kansas_2,
#                 progfunc = "None", scm = T)

# Example function to check for completeness - could be expanded to check for missing 
# values or other issues. Catching and returning errors will make use much easier 
# than allowing the R session to abort.
check_data <- function(data, unit, time) {
  
  # Check whether there are omitted rows
  full_data <- data %>% tidyr::expand({{unit}}, {{time}})
  
  if(nrow(data) != nrow(full_data)) stop("There are missing rows in the input data set. Panel must be balanced.")
  
}

check_data(kansas, fips, year_qtr) # silent when no issue detected
check_data(kansas_2, fips, year_qtr)
@ebenmichael
Copy link
Owner

This looks great! I'll incorporate it. Or if you'd like, you could add this in to the single_augsynth function, add a test for it in the test_format.R file and make a pull request.

@ebenmichael
Copy link
Owner

Hi sorry for the long delay on this. I'm hoping to get around to this soon!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants