-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Debug implementation for proj::Proj #72
Conversation
let proj = Proj::new(wgs84).unwrap(); | ||
let debug_string = format!("{:?}", proj); | ||
assert_eq!( | ||
"Proj { id: Some(\"longlat\"), description: Some(\"PROJ-based coordinate operation\"), definition: Some(\"proj=longlat datum=WGS84 no_defs ellps=WGS84 towgs84=0,0,0\"), has_inverse: true, accuracy: -1.0 }", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that if PROJ changes any of the text here in across PROJ versions, this test will fail. Not sure how concerned we should be about that.
@@ -571,8 +607,7 @@ impl Proj { | |||
/// # Safety | |||
/// This method contains unsafe code. | |||
pub fn def(&self) -> Result<String, ProjError> { | |||
let rv = unsafe { proj_pj_info(self.c_proj) }; | |||
_string(rv.definition) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This previous code assumed that definition
field would always be not null in the proj_pj_info
result, which might be the case. But in my testing, proj_pj_info
will sometimes return null for the id
and description
string fields. The PROJ docs doesn't provide clarity on null values for the proj_pj_info
API, so for safety I wrapped all the String
fields in an Option
.
So in this instance, I replaced a potential null pointer dereference with a panic via expect
@@ -124,15 +124,18 @@ impl Area { | |||
} | |||
|
|||
/// Easily get a String from the external library | |||
pub(crate) fn _string(raw_ptr: *const c_char) -> Result<String, ProjError> { | |||
let c_str = unsafe { CStr::from_ptr(raw_ptr) }; | |||
pub(crate) unsafe fn _string(raw_ptr: *const c_char) -> Result<String, ProjError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this function takes in a pointer and dereferences it, it's possible for someone (one of us since it's pub(crate)
) to misuse it and reference invalid memory and cause undefined behavior, so we need to mark it as unsafe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The latest force push marks a couple functions in network.rs
as unsafe
for the same reason. They surfaced in CI because _string
gets used within those network functions.
_string(rv) | ||
unsafe { | ||
let rv = proj_errno_string(code); | ||
_string(rv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing my previous comment about _string
safety– because we use proj_errno_string
and _string
in conjunction here, this function is safe. No matter what value of code
we pass to our error_message
function, we'll never reference invalid memory (under the assumption proj_errno_string
always returns a valid string pointer) so we don't need to mark this function as unsafe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense! Would you preserve this rationale in a comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
under the assumption
proj_errno_string
always returns a valid string pointer
Lo and behold, proj_errno_string
does not always return a valid string pointer:
https://github.com/OSGeo/PROJ/blob/5e077729274f5d28e137e1a41f7d3350146614ef/src/strerrno.cpp#L12-L14
https://github.com/OSGeo/PROJ/blob/5e077729274f5d28e137e1a41f7d3350146614ef/src/strerrno.cpp#L37-L75
Specifically, it returns a null pointer if code == 0
, so we should handle that here by adding a if rv.is_null()
check. I opened a new GitHub Issue for that since I feel this pull request has already exceeded the initial scope I intended: #73. So prior to this pull request, if one called error_message(0)
, we would dereference a null pointer in _string
. After this pull request, we'll panic because of the assert
I added in _string
.
let id = if pj_info.id.is_null() { | ||
None | ||
} else { | ||
Some(_string(pj_info.id).expect("PROJ built an invalid string")) | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the 1.50.0 release, we could use bool::then
here, but I think it might be a little too soon to avoid breakage for downstream users. We should codify our policy around Rust versions at some point...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh right. rust versions. 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too soon! But yes, we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks helpful!
_string(rv) | ||
unsafe { | ||
let rv = proj_errno_string(code); | ||
_string(rv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes sense! Would you preserve this rationale in a comment?
version: _string(pinfo.version)?, | ||
searchpath: _string(pinfo.searchpath)?, | ||
}) | ||
unsafe { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I've gathered from your other changes in this PR is that you advocate marking the function as unsafe fn
when the caller of the function needs to ensure some invariant is upheld, whereas if all unsafe invariants are intended to be handled within the function, so the caller needn't consider them, you don't mark the function as unsafe fn
.
Is that right? I've never really considered it, but that makes sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you advocate marking the function as
unsafe fn
when the caller of the function needs to ensure some invariant is upheld, whereas if all unsafe invariants are intended to be handled within the function, so the caller needn't consider them, you don't mark the function asunsafe fn
Yeah exactly. And I think this section of TRPL covers that better than I can explain:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm permanently confused about when / when not to mark functions unsafe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When determining whether a function should be marked unsafe
, I ask myself: when a user calls this function, is it possible for the user to provide inputs to the function such that it could result in any of the behaviors Rust considers undefined, which includes:
Dereferencing (using the
*
operator on) a dangling or unaligned raw pointer.
and
...raw pointer read from uninitialized memory, ...
So consider this function (taken directly from the FFI page in the Nomicon):
fn kaboom(ptr: *const i32) -> i32 {
*ptr
}
I'm able to call this function with whatever pointer I want and it will attempt to read from that memory location (e.g. something like: kaboom(0x1a1a as *const i32)
will attempt to read a i32
at memory address 0x1a1a
). So I am able to call this function with a certain input that will cause it to perform undefined behavior (in this case read from uninitialized memory). Thus we need to mark the function as unsafe
.
Coming back to PROJ-land, consider this example:
/// Print the description of a CRS given a definition string
fn crs_description(crs_definition: CString) {
let pj_ptr = proj_sys::proj_create(0, crs_definition.as_ptr());
let pj_proj_info = proj_sys::proj_pj_info(pj_ptr);
println!("Description: {}", _string(pj_proj_info.description));
}
Is it possible to call this function and have it succeed? Yes! Pass it a valid crs_definition
that we know has a description and it should print.
Is it possible to call this function that will result in undefined behavior? Yes! Pass it a crs_definition
PROJ doesn't know about and proj_sys::proj_create
will return a null pointer, and then proj_sys::proj_pj_info
will attempt to deference the null pointer, which is considered undefined. Thus we would need to mark this function as unsafe. And since it's marked as unsafe
we would add a section in the docs for this function # Safety
that describes how you could use this function in a safe way (only passing it valid CRS definitions).
And one more example:
/// Print the description of a CRS given a definition string
fn crs_description(crs_definition: CString) {
let pj_ptr = proj_sys::proj_create(0, crs_definition.as_ptr());
if pj_ptr.is_null() {
return;
}
let pj_proj_info = proj_sys::proj_pj_info(pj_ptr);
if pj_proj_info.description.is_null() {
return;
}
println!("Description: {}", _string(pj_proj_info.description));
}
Is it possible to call this function that will result in undefined behavior? Assuming I'm not overlooking anything, I don't think so! You can pass it whatever string you want and we shouldn't encounter any undefined behavior. Thus we don't need to mark it as unsafe.
Hope that helps a little!
@@ -142,7 +142,7 @@ pub(crate) unsafe extern "C" fn network_open( | |||
} | |||
|
|||
/// Where the ACTUAL work happens, taking advantage of Rust error-handling etc | |||
fn _network_open( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See my comment below regarding the changes in this file: https://github.com/georust/proj/pull/72/files#r574620593
This is great, thanks for catching the potential null pointer deref. Are you happy for this to merge as-is? |
Closes #64