Add a mechanism to specify "open world" files #738
Comments
Good write-up. I have only a few things to add at this time:
Not always: insertion of a cast at a call site might be blocked by a macro (or, in theory, unwritable code, although that would be a very unusual project setup). I believe 3C handles this correctly now.
Soundness: As I think we discussed briefly at Friday's meeting, in an open world, any guarantee of spatial memory safety depends on an assumption that external code interacts with shared declarations in a way consistent with the declarations' Checked C annotations (as well as an assumption that the external code is internally spatially safe, which there's nothing we can do about). This applies to all kinds of program elements: not just undefined functions, but also structs and global variables that could be generated in project code and passed to external code or vice versa. The common case is that the project uses a plain-C external library and needs Checked C declarations for the library that are consistent with its documented behavior, as currently described in the [...] In all of these cases, to give the user a practical way to ensure the project is spatially safe, we need a workflow to ensure that all shared declarations get reviewed. This generalizes what is currently described in #698 for undefined functions. It might be appropriate to broaden #698 to cover review of all kinds of shared declarations and keep this issue for discussion of the remaining topics.
Porting phase 2: If we run 3C on the whole project (as is our current practice), we don't actually have unknown callers and definitions; we just want to generate output that is compatible with all callers and definitions as they currently appear in the input, so that the user can copy part of the output into their main branch without introducing errors in the rest of the code.
In principle, if the user knows in advance which parts of the code they are going to update, a more direct way to achieve the same end goal would be to mark the rest of the code as unwritable, assuming 3C has an option to do that (I will file an issue for that soon). The use of [...] Assuming the user wants to copy file A (and maybe some header files?), conceivably there might be edge cases in which running 3C with all other files marked unwritable gives slightly better output for A than just running 3C with [...]
Since running 3C on the whole project as part of phase 2 is a fundamentally different use scenario than a truly open world, we can expect that it would have slightly different needs that would ideally be accommodated by a separate option, although we can decide whether they are important enough to justify the work of actually maintaining a separate option. A few things I can think of so far:
On the other hand, if we run 3C on only one file at a time (to reduce the running time?), then that file is truly open-world from the point of view of that 3C run, except that we don't have to worry about soundness of declarations shared only with the rest of the project because we ultimately will be checking the entire project against those declarations.
Allow the user of 3C to specify files and directories as "open world" instead of the default "closed world". For closed world files, 3C would continue using its current assumption that it has a complete program with all dependencies and clients available. Open world files drop this assumption: dependencies might not be available (manifesting as undefined functions), and there might be more clients than appear in the available source code (so inferred types must accommodate unchecked callers without casting).
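As a rough illustration of the per-file distinction proposed above, the following Python sketch classifies paths as open or closed world given a list of open world directories. It is only a model of the idea, not 3C code: the names `is_open_world` and `classify` are hypothetical, and it assumes POSIX-style paths that are all relative to the same base directory.

```python
import posixpath
from pathlib import PurePosixPath


def _parts(path):
    """Normalize a POSIX-style path and split it into its components."""
    return PurePosixPath(posixpath.normpath(path)).parts


def is_open_world(path, open_world_roots):
    """Return True if `path` is an open world root or lies beneath one.

    Hypothetical helper: assumes every path is relative to the same base.
    """
    file_parts = _parts(path)
    for root in open_world_roots:
        root_parts = _parts(root)
        # A root matches when its components are a prefix of the file's.
        if file_parts[:len(root_parts)] == root_parts:
            return True
    return False


def classify(paths, open_world_roots):
    """Map each path to "open" or "closed"; closed world is the default."""
    return {
        p: "open" if is_open_world(p, open_world_roots) else "closed"
        for p in paths
    }
```

Comparing whole path components rather than string prefixes keeps a directory such as `include2` from being picked up by a root of `./include`, and a root of `.` matches every file in the project.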
Much of the unique behavior enabled for open world files would be the same as can already be controlled with the `-itypes-for-extern` and `-infer-types-for-undefs` flags. It might be possible to redefine them in terms of applying open or closed world behavior to the entire project in order to avoid duplicating similar logic. Alternatively, they could be deprecated in favor of whatever new mechanism is added to specify open world files.
Closed world
We assume 3C has access to all callers of all functions, and all uses of all structures, global variables, and typedefs. This matches 3C's current assumptions, so behavior should not change in this mode.
Open World
We cannot assume that 3C can see all function callers, function definitions, etc., so the analysis must be adjusted to permit arbitrary types in the missing code. This will act like some combination of the current `-itypes-for-extern` and `-infer-types-for-undefs` flags.
[...] `-itypes-for-extern`.
Undefined functions are handled as they are with the current `-infer-types-for-undefs` flag. Since the function definition is not visible, we must conservatively treat the definition as unchecked. Undefined functions will then be internally unsafe, and will be rewritten to use itypes. An open question here is how 3C should go about forcing a function parameter to be an itype. Currently, `-itypes-for-extern` does this by moving checked types into itypes only during rewriting, but this is known to cause invalid rewriting in some cases. Instead, the internal constraint variable for the parameter might be constrained to WILD. This avoids potential Checked C type errors, but limits conversion of local variables inside the function. Similar questions exist for structure fields, global variables, and typedefs.
[...] `-itypes-for-extern`.
[...] `-itypes-for-extern`. The workaround seems unsatisfying. If every typedef is unchecked, then every local variable using the typedef would have to remain unchecked as well. Other ideas have been discussed for duplicating typedefs into checked and unchecked versions.
Example Use: converting a library header
In the libjpeg tutorial, the `jpeglib.h` header file was copied into a local include directory. 3C was then re-run with `-infer-types-for-undefs` to enable solving for and inserting checked types into the local copy of the header file even though the functions in the header were not defined.
After the changes proposed here, the flag passed to 3C would then be `-open-world=./include` to specify that files in `./include` use the open world assumptions. All other files use the (default) closed world assumptions. When 3C is re-run, the open world assumption allows the undefined functions in `jpeglib.h` to solve to itypes as before. The changes are:
[...] `./include` are still unchecked.
[...] `to_ppm.c` due to local variables using the typedef remaining unchecked.
Example Use: Converting a single file in a project
The current approach is to enable `-itypes-for-extern`, convert with 3C, and then keep only the converted header files and the single source file you want to convert. Instead, the whole project could be specified as open world with `-open-world=.`. This would make all functions solve to itypes, all typedefs be unchecked, and all structure fields and global variables use itypes instead of checked types. The changes from current behavior are:
[...] `-itypes-for-extern` is only expected to be enabled in phase two of porting, it might be reasonable to assume that there are no undefined functions outside of any previously specified open world files, since these could be warnings or errors as discussed earlier.
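The effect of treating the whole project as open world, as described above (functions solve to itypes, typedefs stay unchecked, structure fields and global variables use itypes), can be summarized in a small Python sketch. This is a deliberately simplified model and not how 3C is implemented: the function `declared_form` and its string labels are hypothetical, and the closed world branch ignores any case where a closed world run would itself choose something other than a plain checked type.

```python
KINDS = {"function", "struct_field", "global", "typedef"}


def declared_form(kind, solved_checked, open_world):
    """Pick how a declaration would be written out in this simplified model.

    kind: one of KINDS (hypothetical labels for this sketch)
    solved_checked: True if the solver found a checked type for it
    open_world: True if the declaring file uses open world assumptions
    """
    if kind not in KINDS:
        raise ValueError(f"unknown declaration kind: {kind}")
    if not solved_checked:
        # Nothing checked was inferred, so there is nothing to expose.
        return "unchecked"
    if not open_world:
        # Simplification: every user is visible, so use the checked type.
        return "checked"
    if kind == "typedef":
        # Per the proposal, typedefs in open world files stay unchecked.
        return "unchecked"
    # Functions, fields, and globals keep unchecked users compatible.
    return "itype"
```

For example, `declared_form("function", True, True)` gives `"itype"`, while the same declaration in a closed world file gives `"checked"`. The typedef branch is the part the issue calls unsatisfying, since every local variable using such a typedef would have to remain unchecked as well.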