Pandas read_csv gives an error: Type of "read_csv" is partially unknown #230
Comments
Experimenting a bit more, this also happens with other pandas functions, such as .isin() and .dropna(). |
The "partially unknown" error should appear only if you enable "strict" type checking. In this mode, Pylance will report any cases where types are unknown or partially unknown. In this case, the pandas type stub has incomplete information. It is not providing type arguments in some cases (e.g. it uses "Iterable" rather than "Iterable[Type]"). The pandas type stubs are still under development and have known holes like this. Until they are complete, you will not be able to use them with "strict" type checking. |
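To illustrate the kind of hole described above, here is a minimal sketch. It does not use the actual pandas stubs; `join_untyped` and `join_typed` are hypothetical names invented for this example. The first signature uses a bare `Iterable`, so a strict checker would be expected to treat the element type as unknown, while the second supplies the type argument.

```python
from collections.abc import Iterable


def join_untyped(items: Iterable) -> str:
    # Bare Iterable: the element type is not declared, so a strict
    # type checker is expected to treat each `item` as Unknown.
    return ",".join(str(item) for item in items)


def join_typed(items: Iterable[str]) -> str:
    # Iterable[str]: the element type is fully specified, so nothing
    # in this signature is unknown to the checker.
    return ",".join(items)
```

Both functions behave the same at runtime; the difference is only visible to the type checker.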
These stubs will be much, much improved in the next release as well. |
Though there is one thing here that I think might be unintended; we're showing an error related to the definition of a stub in a user file. If the stub is "wrong", I'm not sure that this error is actionable from the use side of the call. |
Yes, I enabled strict mode. I suspected it was an issue with the stub files, but as @jakebailey says, this shouldn't show as an error since the problem is not under the user's control. Even in strict mode, it should probably just show a warning stating that there is a problem with the stubs (and that it isn't something the user can do something about). |
The whole purpose of strict mode is to inform the user that there is a "hole" in type checking. If you want to adjust the diagnostic severities for specific rules (e.g. change an error into a warning), you can configure it as such. |
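As a rough sketch of that kind of adjustment: Pylance exposes the `python.analysis.diagnosticSeverityOverrides` setting, and Pyright also accepts per-file comments. The example below uses the per-file comment form. The rule names shown are an assumption about which rules produce the "partially unknown" diagnostics in this thread, so check them against the message you actually see. `count_rows` is a hypothetical helper used only to make the file runnable.

```python
# pyright: strict
# pyright: reportUnknownMemberType=warning, reportUnknownVariableType=warning
#
# The first comment asks for strict checking in this file. The second
# downgrades two "unknown type" rules from error to warning (assumed to
# be the relevant rules here; adjust to match the reported diagnostic).

from collections.abc import Iterable


def count_rows(rows: Iterable[str]) -> int:
    # Ordinary fully typed code is unaffected by the overrides above.
    return sum(1 for _ in rows)
```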
Fair enough, I get your point. Feel free to close this. Although I still think it would be nice to make it clear to the user that the problem stems from the stubs, and not from his own code. |
That can turn out to be difficult, unfortunately. For example, if a library has declared a generic, and the user code misuses it and forgets to specify one of the types in their annotations, then from the type checker's point of view it's not much different than the above case because there's something in the current code that's unknown. I'm not entirely certain it's possible to know for certain who is "at fault" when the unknown happens... |
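A small sketch of the user-side case described above, under the assumption of a hypothetical library class `Box` (not a real pandas type). When user code annotates a parameter as plain `Box` and omits the type argument, the unknown originates in the user's annotation, yet from the checker's side it resembles an incomplete stub.

```python
from typing import Generic, TypeVar

T = TypeVar("T")


class Box(Generic[T]):
    # Hypothetical "library" generic with a complete declaration.
    def __init__(self, value: T) -> None:
        self.value = value


def unwrap_unknown(box: Box):
    # User code forgot the type argument, so the type of `box.value`
    # is expected to be reported as unknown in strict mode.
    return box.value


def unwrap_known(box: Box[int]) -> int:
    # With the type argument supplied, `box.value` is known to be int.
    return box.value
```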
@ldorigo Is this still the case in recent releases? Our pandas stubs have been improved a number of times since this issue was created. |
This issue has been waiting for a follow up for 30 days. Because we haven't heard back, we'll be closing this ticket. Feel free to reach out if this is still a problem! |
Am I the only one who still has this issue? Is my type setup not working, or will this error still occur in strict mode? What is the workaround if I would like to use pandas in a strict environment? |
@Donnerstagnacht, below is the signature of read_csv as shown by Pylance:
(function) def read_csv(
filepath_or_buffer: FilePath | ReadCsvBuffer[bytes] | ReadCsvBuffer[str],
*,
sep: str | None = ...,
delimiter: str | None = ...,
header: int | Sequence[int] | Literal['infer'] | None = ...,
names: ListLikeHashable[Unknown] | None = ...,
index_col: int | str | Sequence[str | int] | Literal[False] | None = ...,
usecols: UsecolsArgType[Unknown] = ...,
dtype: DtypeArg | defaultdict[Unknown, Unknown] | None = ...,
engine: CSVEngine | None = ...,
converters: Mapping[int | str, (str) -> Any] | Mapping[int, (str) -> Any] | Mapping[str, (str) -> Any] | None = ...,
true_values: list[str] = ...,
false_values: list[str] = ...,
skipinitialspace: bool = ...,
skiprows: int | Sequence[int] | ((int) -> bool) = ...,
skipfooter: int = ...,
nrows: int | None = ...,
na_values: Sequence[str] | Mapping[str, Sequence[str]] = ...,
keep_default_na: bool = ...,
na_filter: bool = ...,
verbose: bool = ...,
skip_blank_lines: bool = ...,
parse_dates: bool | list[int] | list[str] | Sequence[Sequence[int]] | Mapping[str, Sequence[int | str]] = ...,
infer_datetime_format: bool = ...,
keep_date_col: bool = ...,
date_format: dict[Hashable, str] | str | None = ...,
dayfirst: bool = ...,
cache_dates: bool = ...,
iterator: Literal[False] = ...,
chunksize: None = ...,
compression: CompressionOptions = ...,
thousands: str | None = ...,
decimal: str = ...,
lineterminator: str | None = ...,
quotechar: str = ...,
quoting: CSVQuoting = ...,
doublequote: bool = ...,
escapechar: str | None = ...,
comment: str | None = ...,
encoding: str | None = ...,
encoding_errors: str | None = ...,
dialect: str | Dialect = ...,
on_bad_lines: ((list[str]) -> (list[str] | None)) | Literal['error', 'warn', 'skip'] = ...,
delim_whitespace: bool = ...,
low_memory: bool = ...,
memory_map: bool = ...,
float_precision: Literal['high', 'legacy', 'round_trip'] | None = ...,
storage_options: StorageOptions = ...,
dtype_backend: DtypeBackend | Literal[_NoDefault.no_default] = ...
) -> DataFrame |
Environment data
Expected behaviour
There shouldn't be an error whenever read_csv is used?
Actual behaviour
Whenever I use pd.read_csv(), Pylance reports the following error: Type of "read_csv" is partially unknown
Logs
Don't think logs are necessary, if so I will be happy to add them.
Code Snippet / Additional information