-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Potential false negative problem in type confinement #10
Comments
Hi, From my understanding, your question is about whether indirect assignment between function and struct fields will be missed or not. More specifically, in your case the type-confinement happens in This is an interesting limitation of MLTA, but it will not introduce FN but introduce some FP instead. In your case, MLTA will treat all the functions that are with type stream_flush_callback_t (or types that can be cast to this type) as potential right values in I cannot build the core project (the install. md of the project seems not updated). Have you checked if MLTA will actually fail to capture the layered information for Hope my answer helps! |
I build dovecot following oss-fuzz. I actually don't run MLTA on the bitcode level. I tried to implement a source-code level MLTA (which of course cannot be as accurate as at the IR level) and found the issue when running it. I understand that when MLTA identifies an escape type, it may downgrade to solve i-call by FLTA. And your answer seems to imply that in this case, MLTA will identify the type of By reviewing the source code of MLTA, I speculate that if However, when |
Thanks for the feedback, as oss-fuzz typically will generate many exe files, could you tell me which specific executable in oss-fuzz/dovecot are you targeting? |
I actually forget that, but I implement 3 simple testcases with the code in dovecot. Although they cannot run, but can generate bitcode for MLTA to analyze. testcase1#define TRUE 1
#define bool _Bool
#define NULL 0
typedef void io_callback_t(void *context);
typedef void iostream_pump_callback_t(enum iostream_pump_status status,
void *context);
typedef int stream_flush_callback_t(void *context);
enum iostream_pump_status {
/* pump succeeded - EOF received from istream and all output was
written successfully to ostream. */
IOSTREAM_PUMP_STATUS_INPUT_EOF,
/* pump failed - istream returned an error */
IOSTREAM_PUMP_STATUS_INPUT_ERROR,
/* pump failed - ostream returned an error */
IOSTREAM_PUMP_STATUS_OUTPUT_ERROR,
};
struct ostream {
/* Number of bytes sent via o_stream_send*() and similar functions.
This is counting the input data. For example with a compressed
ostream this is counting the uncompressed bytes. The compressed
bytes could be counted from the parent ostream's offset.
Seeking to a specified offset only makes sense if there is no
difference between input and output data sizes (e.g. there are no
wrapper ostreams changing the data). */
unsigned int offset;
/* errno for the last operation send/seek operation. cleared before
each call. */
int stream_errno;
/* overflow is set when some of the data given to send()
functions was neither sent nor buffered. It's never unset inside
ostream code. */
bool overflow:1;
/* o_stream_send() writes all the data or returns failure */
bool blocking:1;
bool closed:1;
struct ostream_private *real_stream;
};
struct ostream_private {
void (*set_flush_callback)(struct ostream_private *stream,
stream_flush_callback_t *callback,
void *context);
struct ostream *parent;
stream_flush_callback_t *callback;
void *context;
};
struct iostream_pump {
int refcount;
struct io* io;
struct ostream *output;
iostream_pump_callback_t *callback;
void *context;
bool waiting_output;
bool completed;
};
struct io {
const char *source_filename;
unsigned int source_linenum;
/* trigger I/O callback even if OS doesn't think there is input
pending */
bool pending;
/* This IO event shouldn't be the only thing being waited on, because
it would just result in infinite wait. */
bool never_wait_alone;
io_callback_t *callback;
void *context;
struct ioloop *ioloop;
struct ioloop_context *ctx;
};
void iostream_pump_start(struct iostream_pump *pump)
{
o_stream_set_flush_callback(pump->output,
iostream_pump_flush, pump);
}
void o_stream_set_flush_callback(struct ostream *stream,
stream_flush_callback_t *callback,
void *context)
{
struct ostream_private *_stream = stream->real_stream;
_stream->set_flush_callback(_stream, callback, context);
}
// indirect invoked by _stream->set_flush_callback
static void
o_stream_default_set_flush_callback(struct ostream_private *_stream,
stream_flush_callback_t *callback,
void *context)
{
if (_stream->parent != NULL)
o_stream_set_flush_callback(_stream->parent, callback, context);
_stream->callback = callback;
_stream->context = context;
}
static int iostream_pump_flush(struct iostream_pump *pump)
{
int ret = 0;
return ret;
}
static void stream_send_io(struct ostream_private *stream) {
// indirect-call
stream->callback(stream->context);
} In this case, the assignment of testcase2#define TRUE 1
#define bool _Bool
#define NULL 0
typedef void io_callback_t(void *context);
typedef void iostream_pump_callback_t(enum iostream_pump_status status,
void *context);
typedef int stream_flush_callback_t(void *context);
enum iostream_pump_status {
/* pump succeeded - EOF received from istream and all output was
written successfully to ostream. */
IOSTREAM_PUMP_STATUS_INPUT_EOF,
/* pump failed - istream returned an error */
IOSTREAM_PUMP_STATUS_INPUT_ERROR,
/* pump failed - ostream returned an error */
IOSTREAM_PUMP_STATUS_OUTPUT_ERROR,
};
struct ostream {
/* Number of bytes sent via o_stream_send*() and similar functions.
This is counting the input data. For example with a compressed
ostream this is counting the uncompressed bytes. The compressed
bytes could be counted from the parent ostream's offset.
Seeking to a specified offset only makes sense if there is no
difference between input and output data sizes (e.g. there are no
wrapper ostreams changing the data). */
unsigned int offset;
/* errno for the last operation send/seek operation. cleared before
each call. */
int stream_errno;
/* overflow is set when some of the data given to send()
functions was neither sent nor buffered. It's never unset inside
ostream code. */
bool overflow:1;
/* o_stream_send() writes all the data or returns failure */
bool blocking:1;
bool closed:1;
struct ostream_private *real_stream;
};
struct ostream_private {
void (*set_flush_callback)(struct ostream_private *stream,
stream_flush_callback_t *callback,
void *context);
struct ostream *parent;
stream_flush_callback_t *callback;
void *context;
};
struct iostream_pump {
int refcount;
struct io* io;
struct ostream *output;
iostream_pump_callback_t *callback;
void *context;
bool waiting_output;
bool completed;
};
struct io {
const char *source_filename;
unsigned int source_linenum;
/* trigger I/O callback even if OS doesn't think there is input
pending */
bool pending;
/* This IO event shouldn't be the only thing being waited on, because
it would just result in infinite wait. */
bool never_wait_alone;
io_callback_t *callback;
void *context;
struct ioloop *ioloop;
struct ioloop_context *ctx;
};
static int iostream_pump_flush(void *pump)
{
int ret = 0;
return ret;
}
void o_stream_set_flush_callback(struct ostream *stream,
stream_flush_callback_t *callback,
void *context);
// indirect invoked by _stream->set_flush_callback
static void
o_stream_default_set_flush_callback(struct ostream_private *_stream,
stream_flush_callback_t *callback,
void *context)
{
_stream->callback = callback;
_stream->context = context;
}
void o_stream_set_flush_callback(struct ostream *stream,
stream_flush_callback_t *callback,
void *context)
{
struct ostream_private *_stream = stream->real_stream;
o_stream_default_set_flush_callback(_stream, callback, context);
}
void iostream_pump_start(struct iostream_pump *pump)
{
o_stream_set_flush_callback(pump->output,
iostream_pump_flush, pump);
}
static void stream_send_io(struct ostream_private *stream) {
// indirect-call
stream->callback(stream->context);
}
int main() {
struct iostream_pump *pump;
iostream_pump_start(pump);
stream_send_io(pump->output);
return 0;
} In this case, the assignment of addr-taken function involves two direct callsite. testcase3#define TRUE 1
#define bool _Bool
#define NULL 0
typedef void io_callback_t(void *context);
typedef void iostream_pump_callback_t(enum iostream_pump_status status,
void *context);
typedef int stream_flush_callback_t(void *context);
enum iostream_pump_status {
/* pump succeeded - EOF received from istream and all output was
written successfully to ostream. */
IOSTREAM_PUMP_STATUS_INPUT_EOF,
/* pump failed - istream returned an error */
IOSTREAM_PUMP_STATUS_INPUT_ERROR,
/* pump failed - ostream returned an error */
IOSTREAM_PUMP_STATUS_OUTPUT_ERROR,
};
struct ostream {
/* Number of bytes sent via o_stream_send*() and similar functions.
This is counting the input data. For example with a compressed
ostream this is counting the uncompressed bytes. The compressed
bytes could be counted from the parent ostream's offset.
Seeking to a specified offset only makes sense if there is no
difference between input and output data sizes (e.g. there are no
wrapper ostreams changing the data). */
unsigned int offset;
/* errno for the last operation send/seek operation. cleared before
each call. */
int stream_errno;
/* overflow is set when some of the data given to send()
functions was neither sent nor buffered. It's never unset inside
ostream code. */
bool overflow:1;
/* o_stream_send() writes all the data or returns failure */
bool blocking:1;
bool closed:1;
struct ostream_private *real_stream;
};
struct ostream_private {
void (*set_flush_callback)(struct ostream_private *stream,
stream_flush_callback_t *callback,
void *context);
struct ostream *parent;
stream_flush_callback_t *callback;
void *context;
};
struct iostream_pump {
int refcount;
struct io* io;
struct ostream *output;
iostream_pump_callback_t *callback;
void *context;
bool waiting_output;
bool completed;
};
struct io {
const char *source_filename;
unsigned int source_linenum;
/* trigger I/O callback even if OS doesn't think there is input
pending */
bool pending;
/* This IO event shouldn't be the only thing being waited on, because
it would just result in infinite wait. */
bool never_wait_alone;
io_callback_t *callback;
void *context;
struct ioloop *ioloop;
struct ioloop_context *ctx;
};
static int iostream_pump_flush(void *pump)
{
int ret = 0;
return ret;
}
// indirect invoked by _stream->set_flush_callback
static void
o_stream_default_set_flush_callback(struct ostream_private *_stream,
stream_flush_callback_t *callback,
void *context)
{
_stream->callback = callback;
_stream->context = context;
}
static void stream_send_io(struct ostream_private *stream) {
// indirect-call
stream->callback(stream->context);
}
int main() {
struct ostream *stream;
void* context;
o_stream_default_set_flush_callback(stream, iostream_pump_flush, context);
stream_send_io(stream);
return 0;
} In this case, the assignment of addr-taken function involves only one callsite. resultHowever, I notice in all 3 cases MLTA fail to report |
This also reminds me of an interesting extended question, which is "Can MLTA induce false negatives when it comes to complex data flows?" compared with FLTA. In your paper, you seem to hypothesize that all address-taken functions referred to by structs fields are assigned straightforwardly. However, in the following cases. address-taken function In my understanding, MLTA seems not to conduct such complex data-flow (which may also include inter-procedural complex data-flow). In this case, MLTA should regard void addr_taken_func();
...
func_ptr_t funcs[10];
funcs[5] = addr_taken_func;
...
struct Handler handler;
handler.func = funcs[5]; |
In your crix project. You seem to comment that such a case as: |
Thanks for the description and comments
I think you are mentioning there are no false negatives but more false positives here. I believe MLTA does not involve expensive inter-procedural analysis as it can quickly scanned through the Linux kernel. Involving inter-procedural analysis might increase the overhead. I can double-check the code to make sure whether there are inter-procedural analyses in MLTA or not.
Thanks for the cases! Excited to see MLTA actually misses some cases. I will go through the cases and get back to you when I find something interesting |
I guess the code is in MLTA::typeConfineInFunction(Function *F), where my understanding for this part of code is that they traversing all arguments of a callInst, if encounter a function being used as a argument. Try to deep into the callee to continue type confining. Well, maybe there are some mistakes in my understanding. else if (CallInst *CI = dyn_cast<CallInst>(I)) {
for (User::op_iterator OI = I->op_begin(),
OE = I->op_end();
OI != OE; ++OI) {
if (Function *F = dyn_cast<Function>(*OI)) {
if (F->isIntrinsic())
continue;
if (CI->isIndirectCall()) {
confineTargetFunction(*OI, F);
continue;
}
Value *CV = CI->getCalledOperand();
Function *CF = dyn_cast<Function>(CV);
if (!CF)
continue;
if (CF->isDeclaration())
CF = Ctx->GlobalFuncMap[CF->getGUID()];
if (!CF)
continue;
if (Argument *Arg = getParamByArgNo(CF, OI->getOperandNo())) {
for (auto U : Arg->users()) {
if (isa<StoreInst>(U) || isa<BitCastOperator>(U)) {
confineTargetFunction(U, F);
}
}
}
// TODO: track into the callee to avoid marking the
// function type as a cap
}
}
} |
Hi, author. My recent work requires the use of MLta. I have been reading the code of MLTA and making some adjustments, but there are some parts of the code that I do not fully understand. I would like to seek your guidance on these. 1.Question 1The first place is in MLTA::confineTargetFunction, where you mark a function-type as escaping. However, If I use fuzz type match by comparing the call arguments and function parameters one by one. Do I need to mark the escaped function-type? I give an example typedef void (*fptr_long)(long);
typedef void (*fptr_int)(int);
typedef void (*fptr_ptr)(fptr_long);
void f1(long a) {}
void f2(long a) {}
void f3(long a) {}
void scene1_b(fptr_int f) { f(0); } // call1
void scene1_a () {
fptr_int f = (fptr_int)&f1;
scene1_b(f);
}
struct S {
fptr_long one;
fptr_int two;
};
void scene2_b (struct S* s) { s->one(0); } // call2
void scene2_a () {
struct S s = {&f2 , 0};
scene2_b(&s);
}
fptr_long callback;
void set_callback(fptr_long f) { callback = f; }
void scene3_b () { callback(0); } // call
void scene3_a () {
fptr_ptr some_cb_target = &set_callback ;
some_cb_target(&f3); // call3
}
int main(){
scene1_a();
scene2_a();
scene3_a();
return 0;
} Where in this case, 2.Question 2The following code in MLTA::typeConfineInFunction(Function *F). else if (CallInst *CI = dyn_cast<CallInst>(I)) {
for (User::op_iterator OI = I->op_begin(),
OE = I->op_end();
OI != OE; ++OI) {
if (Function *F = dyn_cast<Function>(*OI)) {
if (F->isIntrinsic())
continue;
if (CI->isIndirectCall()) {
confineTargetFunction(*OI, F);
continue;
}
Value *CV = CI->getCalledOperand();
Function *CF = dyn_cast<Function>(CV);
if (!CF)
continue;
if (CF->isDeclaration())
CF = Ctx->GlobalFuncMap[CF->getGUID()];
if (!CF)
continue;
if (Argument *Arg = getParamByArgNo(CF, OI->getOperandNo())) {
for (auto U : Arg->users()) {
if (isa<StoreInst>(U) || isa<BitCastOperator>(U)) {
confineTargetFunction(U, F);
}
}
}
// TODO: track into the callee to avoid marking the
// function type as a cap
}
}
} I understand the process for |
First, I would like to express my respect for this work. By reviewing the source code of MLTA and some test cases I found a potential case that may cause MLTA to produce additional false negatives than FLTA.
Below is an example from dovecot project. And
iostream_pump_flush
is an address-taken function used as an argument of call expressiono_stream_set_flush_callback(pump->output, iostream_pump_flush, pump);
.I noticed MLTA does process the case that addr-taken function is used as a call argument for type confining. However, in this case, there exists an indirect call in the call in the call-chain,
_stream->set_flush_callback(_stream, callback, context);
callo_stream_default_set_flush_callback
. Where type confinement happens ino_stream_default_set_flush_callback
. Will this lead to insufficient confinement ofiostream_pump_flush
to fieldostream::callbacks
?The text was updated successfully, but these errors were encountered: