@sbearrows
Contributor

Closes #295

We want to track skipped lines so that problems() reports accurate locations. Currently, if the file contains commented lines, empty lines, or lines skipped via skip = ..., the positions reported by problems() are inaccurate.

In addition, problems() currently reports its row and column numbers relative to the original file. This PR adds a new column, line, that takes over that role from row, making it clear that the value is the line number in the original file. The row column now gives the row in the data frame.

library(vroom)

delim_input <- glue::glue("#Name:,Sharla
                          #Date:,02/01/22
                          x,y
                          1,1
                          2,2.x")

output <- vroom(I(delim_input),
  col_types = "dd", comment = "#", altrep = FALSE
)
#> Warning: One or more parsing issues, call `problems()` on your data frame
#> for details, e.g.:
#>   dat <- vroom(...)
#>   problems(dat)

output
#> # A tibble: 2 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     1
#> 2     2    NA

problems(output)
#> # A tibble: 1 × 6
#>    line   row   col expected actual file                         
#>   <int> <int> <int> <chr>    <chr>  <chr>                        
#> 1     5     2     2 a double 2.x    /private/var/folders/4g/9jcx…
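
To make the line versus row distinction concrete, here is a minimal base R sketch of how a data-frame row can be mapped back to its line in the original file once commented, empty, and skipped lines are tracked. It is an illustration only, not vroom's actual implementation; the helper name `row_to_line` and its arguments are hypothetical.

```r
# Hypothetical helper (not part of vroom): for each data-frame row,
# return the line number it came from in the original file.
row_to_line <- function(lines, comment = "#", skip = 0, header = TRUE) {
  idx <- seq_along(lines)
  # Lines dropped by `skip`
  keep <- idx > skip
  # Lines dropped because they are comments or empty
  keep <- keep & !startsWith(lines, comment) & nzchar(trimws(lines))
  kept <- idx[keep]
  # The first remaining line is the header, not a data row
  if (header) kept <- kept[-1]
  kept
}

lines <- c("#Name:,Sharla", "#Date:,02/01/22", "x,y", "1,1", "2,2.x")
line_of_row <- row_to_line(lines)
line_of_row[2]  # original line of data-frame row 2
#> [1] 5
```

With the input from the example above, data-frame row 2 maps to line 5, which matches the line and row values shown by problems(output).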

Successfully merging this pull request may close these issues.

line number in problems not correct after commented rows.

2 participants