Description
Affected pages
https://www.php.net/manual/en/function.str-getcsv.php
https://www.php.net/manual/en/function.fgetcsv.php
https://www.php.net/manual/en/splfileobject.fgetcsv.php
Issue description
The documentation for str_getcsv, fgetcsv and SplFileObject::fgetcsv describes the escape character incorrectly. Specifically, these pages contain the following note:
Note: Usually an enclosure character is escaped inside a field by doubling it; however, the escape character can be used as an alternative. So for the default parameter values "" and \" have the same meaning. Other than allowing to escape the enclosure character the escape character has no special meaning; it isn't even meant to escape itself.
However, these functions turn "" into " while \" remains \". Therefore "" and \" do not have the same meaning.
The Note should be changed to reflect this difference.
Steps to reproduce
Run the following PHP code and observe the output:
$a = '"foo""bar"';
$b = '"foo\"bar"';
$aCsv = str_getcsv($a);
$bCsv = str_getcsv($b);
var_dump($aCsv);
var_dump($bCsv);
var_dump($aCsv == $bCsv);
Output on PHP 8.4:
Deprecated: str_getcsv(): the $escape parameter must be provided as its default value will change in php shell code on line 3
Deprecated: str_getcsv(): the $escape parameter must be provided as its default value will change in php shell code on line 4
array(1) {
[0]=>
string(7) "foo"bar"
}
array(1) {
[0]=>
string(8) "foo\"bar"
}
bool(false)
Output on PHP 8.0:
array(1) {
[0]=>
string(7) "foo"bar"
}
array(1) {
[0]=>
string(8) "foo\"bar"
}
bool(false)
Output on PHP 7.4:
array(1) {
[0]=>
string(7) "foo"bar"
}
array(1) {
[0]=>
string(8) "foo\"bar"
}
bool(false)
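The reproduction above only exercises str_getcsv. A minimal sketch of the same check for the other two affected functions follows. It assumes an in-memory stream (php://memory) and an SplTempFileObject as stand-ins for a real file, and it passes the delimiter, enclosure and escape arguments explicitly so that PHP 8.4 does not emit the deprecation notice. If all three functions share the behavior described in this issue, both dumps should contain foo"bar followed by foo\"bar.

```php
<?php
// Both lines are single-quoted PHP strings, so the backslash is literal.
$lines = ['"foo""bar"', '"foo\"bar"'];

// fgetcsv() on an in-memory stream; length 0 means "no line length limit".
$handle = fopen('php://memory', 'w+');
fwrite($handle, implode("\n", $lines) . "\n");
rewind($handle);
$viaFgetcsv = [];
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $viaFgetcsv[] = $row[0];
}
fclose($handle);

// SplFileObject::fgetcsv() via SplTempFileObject, same explicit arguments.
$file = new SplTempFileObject();
$file->fwrite(implode("\n", $lines) . "\n");
$file->rewind();
$viaSpl = [];
foreach ($lines as $unused) {
    $viaSpl[] = $file->fgetcsv(',', '"', '\\')[0];
}

var_dump($viaFgetcsv);
var_dump($viaSpl);
```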
Suggested fix
Because this behavior is so confusing, I think a warning rather than a note is appropriate. Here's my suggestion:
Warning: Inside an enclosure, the enclosure character can always be escaped by doubling it, which yields a single enclosure character in the parsed result. The escape character works differently: if it is followed by an enclosure character, that enclosure character is not treated as one, but the escape character itself remains in the result. So for the default parameters, "" inside an enclosure is parsed into ", while \" inside an enclosure is parsed into \".
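To illustrate the proposed wording, here is a small sketch in which the field also contains a delimiter. In both inputs the enclosure stays open across the comma, so each line parses into a single field. Only the doubled form drops a character, while the backslash form keeps the backslash. The helper name parseLine is hypothetical and simply fixes the delimiter, enclosure and escape arguments to the historical defaults.

```php
<?php
// Hypothetical helper: str_getcsv() with delimiter ',', enclosure '"', escape '\'.
function parseLine(string $line): array
{
    return str_getcsv($line, ',', '"', '\\');
}

// Doubled enclosure: collapses to a single '"' and does not close the field.
$doubled = parseLine('"a"",b"');

// Escape character + enclosure: does not close the field, backslash is kept.
$escaped = parseLine('"a\",b"');

var_dump($doubled);
var_dump($escaped);
var_dump($doubled === $escaped);
```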