Skip to content

Groups config variables in struct to emulate a namespace#62

Merged
kent-mcleod merged 1 commit intoseL4:masterfrom
Hensoldt-Cyber:patch-axel-9
Mar 29, 2023
Merged

Groups config variables in struct to emulate a namespace#62
kent-mcleod merged 1 commit intoseL4:masterfrom
Hensoldt-Cyber:patch-axel-9

Conversation

@axel-h
Copy link
Member

@axel-h axel-h commented Mar 8, 2023

  • Groups config variables in struct to emulate a namespace, this avoids polluting the global name space.
  • Pass config down to make the code a bit more generic
  • Add a helper function to improve code readability a bit

@axel-h axel-h force-pushed the patch-axel-9 branch 5 times, most recently from 085e3de to 091dc8b Compare March 10, 2023 11:17
@axel-h axel-h force-pushed the patch-axel-9 branch 2 times, most recently from aef3416 to ab41262 Compare March 10, 2023 20:54
.ram = {
.phys_base = /*? vm_address_config.get('ram_paddr_base') ?*/,
.base = /*? vm_address_config.get('ram_base') ?*/,
.size = /*? vm_address_config.get('ram_size') ?*/,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hlyytine: Do I understand you comment #68 (comment) right that you have code that needs ram_size. In addition to this struct, we can keep const unsigned long ram_size explicitly here with a comment about the compatibility. My intention here is that that this explicitly describes the fixed ordinary RAM the VMs gets. There might be additional RAM, but this should be described somewhere else then - or we have add more documentation how this is supposed to work.

Copy link
Contributor

@hlyytine hlyytine Mar 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hlyytine: Do I understand you comment #68 (comment) right that you have code that needs ram_size.

Yes, that is correct. We use that to determine whether mapping requests are inside guest RAM. In that case we map 2 MB large pages. I guess that could be upstreamed some day, though.

My intention here is that that this explicitly describes the fixed ordinary RAM the VMs gets. There might be additional RAM, but this should be described somewhere else then

What do you mean by "ordinary" RAM?

From guest perspective, guest does not even know whether it's physical memory is stage-2 mapped to frames in the way CAmkES-VM does it or if the frames come from a CAmkES dataport -- and because VMM accesses guest memory (to load images and DTB there, among other purposes) through vm_ram_touch() to access guest's vspace, VMM does not have any possibility to tell the difference between "ordinary" RAM and something other. To both the guest and the VMM (excluding this mapping code) they are all the same, as ordinary as it gets.

To me it seems the definition of "ordinary" here boils down whether it is something that is done in a traditional manner or in some other way, regardless the fact that the guest and the VMM cannot tell the difference. And if we have ability to say that a VMM module that provides mappings worth of ram_size bytes at guest physical address ram_base is required, we could not care less about how it is done. The only exception is this new map_one_to_one which requires that guest physical addresses are the same as real physical addresses. But I am confident we can come up with clever solution.

or we have add more documentation how this is supposed to work.

Personally I prefer improving code even if it requires minor architectural changes and documentation, so I would vote this.

@hlyytine
Copy link
Contributor

By the way, I like this. I was thinking since vm_t * gets passed almost everywhere already, how about if you add these to libsel4vm/include/sel4vm/guest_vm.h:

typedef struct vm_config vm_config_t;

struct vm {
   ...
   const struct vm_config *config:
   ...
};

Then in VM_Arm/src/main.c at the start of main_continued() just add

    vm.config = &vm_config;

By using that forward declaration of typedef struct, we don't bind libsel4vm to CAmkES-VM. Yet the VM's config is accessible in the most logical place.

@axel-h axel-h force-pushed the patch-axel-9 branch 4 times, most recently from 3ec2dab to e140ab9 Compare March 20, 2023 14:03
@axel-h
Copy link
Member Author

axel-h commented Mar 20, 2023

By the way, I like this. I was thinking since vm_t * gets passed almost everywhere already [...]

Yes, this is also what I was considering once this PR is accepted. It would make things easier to have something there. I'm not sure how much dependencies the vm_t has, so an interim solution could be something specific for the VMM then: ```

typedef struct {
    vm_t *vm
    const struct vm_config *config;
    ... other VMM globals here ...
} vmm_context_t;

@axel-h axel-h force-pushed the patch-axel-9 branch 2 times, most recently from 88f0b3f to 72e8aa3 Compare March 20, 2023 16:02
@kent-mcleod
Copy link
Member

Yes, this is also what I was considering once this PR is accepted. It would make things easier to have something there. I'm not sure how much dependencies the vm_t has, so an interim solution could be something specific for the VMM then

Isn't this reversing the dependency ordering? What are some examples of functions that are only given a vm_t but would need to use vm_config? (I'd assume all libsel4vm callbacks already take a void* cookie that would be used for passing something like a vm_config object through)

@axel-h axel-h force-pushed the patch-axel-9 branch 2 times, most recently from c684e74 to 6548f6a Compare March 20, 2023 22:54
@axel-h
Copy link
Member Author

axel-h commented Mar 20, 2023

Isn't this reversing the dependency ordering?

Yes, I fear this might create some issues. I was considering giving vm_t avoid *vmm field, because this seems to come handy sometimes to avoid depending in global variables in callbacks that seem to lack a context parameter. Down this rabbit hole, It seems that even adding the entry field to the structure was not really necessary. This seems CAmkES VMM specific management data only, it can be removed with some more cleanup in the libs.... Any, way this is a follow up step that I'd prefer to keep out of this PR.

@hlyytine
Copy link
Contributor

What do you think the particular problem with config field in libsel4vm's struct vm would be? Let us consider this code:

/* libsel4vm */
typedef struct vm_config vm_config_t;
typedef struct vm {
    const vm_config_t *config;
} vm_t;

const vm_config_t *try_this(vm_t *vm)
{
    return vm->config;
}

This is valid C and libsel4vm does not need to know how struct vm_config is eventually defined in a VMM that uses the library. So this does not create dependency to CAmkES VM. In fact it does not force us to define struct vm_config at all. libsel4vm, if it ever needs to handle that config field, can operate on it without knowing the contents, because it is just a pointer. But CAmkES VM (or any other VM) can access config directly from struct vm. Sure you could just use void *, but this forward typedef struct declaration forces the VM side (user of this library) to use proper type for all config-related things, not just typecasting void * pointers.

- Avoid polluting the global name space and group things in a struct
- Pass config down to make the code more modular

Signed-off-by: Axel Heider <axel.heider@hensoldt.net>
@axel-h
Copy link
Member Author

axel-h commented Mar 21, 2023

The change here use to depend on #71 adding a helper function. Since the discussion there is still ongoing, I've remove this dependency now, so it does not become a unnecessary blocker here.

@axel-h
Copy link
Member Author

axel-h commented Mar 21, 2023

What do you think the particular problem with config field in libsel4vm's struct vm would be?

This should would work and some type safety is nice. It's quite weak, but still better than nothing. I'm just unsure if we should put a vm_config_t * or vmm_context_t * there, or maybe both. I don't really want to create the impression that the vm_config_t from the CAmkES VMM template must be the vm_config_t that vm_t gets. It's still just coincidence that the names are the same. That make me wonder if putting a general vmm_context_t * or a void *cookie would be better, as this it completely agnostic then and does not make developer imply something.

@hlyytine
Copy link
Contributor

hlyytine commented Mar 21, 2023

That make me wonder if putting a general vmm_context_t * or a void *cookie would be better, as this it completely agnostic then and does not make developer imply something.

OK, I missed completely your point, sorry. How about still one more alternative, container_of() as made famous by Linux kernel. Like this:

typedef struct camkes_vm {
   const struct camkes_vm_config *config;
   vm_t vm; /* this one is unmodified from libsel4vm */
} camkes_vm_t;

Example how to use it:

memory_fault_result_t our_fault_callback(vm_t *vm, vm_vcpu_t *vcpu, uintptr_t fault_addr, size fault_length, void *cookie)
{
     /* 1st parameter  is a pointer to the field named in 3rd parameter in a struct named by 2nd parameter */
    camkes_vm_t *camkes_vm = container_of(vm, camkes_vm_t, vm);
    ...
} 

The other way:

vm_memory_reservation_t *example_reserve_at(camkes_vm_t *camkes_vm, uintptr_t start, size_t bytes, 
    memory_fault_result_t (*fault_callback)(vm_t *, vm_vcpu_t *, uintptr_t, size_t, void *))
{
    /* this is from libsel4vm */
    return vm_reserve_memory_at(&camkes_vm->vm, start, bytes, our_fault_callback);
}
  • No need to modify libsel4vm at all
  • CAmkES VM code modification should be search & replace in the editor
  • No casting to/from void *
  • Doesn't force/suggest anything related to naming etc.

@axel-h
Copy link
Member Author

axel-h commented Mar 21, 2023

How about still one more alternative, container_of()

While container_of() is a quite cool thing in some cases, I am a bit uneasy about it making the assumption a give object is inside another bigger object. That is also fragile and can be a pain to debug this the assumption does not hold. What I like about a generic reference slot is, that it is simple and quite explicit. Using container_of() allows hiding the back-link, but I don't think we have a strong need to do this. Also, more complex reference mechanisms can be build on top or a simple reference slot. And I'm also not too worried about the casting in this case.

But what about this PR now, can we agree to merge this?

@kent-mcleod
Copy link
Member

I'd be happy with a container_of() style approach over adding a back-reference field to the inner struct, but I'd still like to see examples of callbacks that don't have a cookie reference arg.

I also think that this PR can be accepted for now with just adding the namespacing.

char const *dtb_base;
} files;

char const *kernel_bootcmdline;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there some specific reason to write it this way and not const char * like elsewhere in the code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const char * is the sloppy way to write it, char const * is more correct actually when it comes to the order.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really so? I know the difference between char const * and char * const, the former being a non-constant pointer to constant character and the latter being a constant pointer to non-constant character. In const char * const both the pointer and the character pointed to would be constant. I also know const char * and char const * are equivalent, but that the latter would be more correct is new to me.

Also if we do simple grep|wc for both the seL4 kernel and CAmkES projects, there are roughly 50 occurrences of const char * per one occurrence of char const *. Should we fix those some day then? And how about other types, should we rewrite const unsigned long * as unsigned long const *? Does it apply to pointers only, or should we fix const unsigned long ram_base to unsigned long const ram_base?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that really so?

const binds to the left. Unless it is the first thing, then it binds to the right. That is why you can define a constant with const int foo = 42, which looks a bit nicer than int const foo = 42.
For an instance vm_config_t, the member don't have to be constants. It can be be used and changed during runtime also to manage a configuration. That why I avoid using a const in the beginning. Just for the instance we create from the template, it happens that all data is known at compile time, so whole instance is define as a const to allows it to be put in ro-data.

... seL4 kernel and CAmkES projects ... Should we fix those some day then?
In some places it could make sense, e.g. for function parameter types or struct members. When constants are declared, it I'd keep it stating with const.

should we rewrite const unsigned long * as unsigned long const *
In my opinion, yes. For every pointer it makes sense.

Does it apply to pointers only, or should we fix const unsigned long ram_base to unsigned long const ram_base?
For declaring integers declaring a constant int const foo = 42 looks really odd, so I would not change this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, good to learn about your preferences. Perhaps if I toyed with the idea a little bit longer, I might even like it. But that's beside the point. The real reason why I questioned the style choice is, at the risk of repeating myself, shouldn't we follow the existing style of whatever project we are contributing to? For example, the pattern if (rvalue == lvalue) has its merits and those are hard to argue against -- IMO there's only one real reason: for those unfamiliar with the pattern it distracts the attention from the focus point. And then there are sensitive ones, like me, who spot style deviations at a first glance (of course I do accidentally those as well, but that's another story). I can speak only for myself, but I don't do that in order to be able to nag and harass people, but it really throws me off. That's why I suggested perhaps we should stick to existing style, even if it's not the best of the best.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hlyytine, your principles about how we try and follow a consistent style are aligned with mine.

We do have a style guide: https://docs.sel4.systems/processes/style-guide.html and the default assumption is that people will try and follow existing conventions in a file.

const char * is the sloppy way to write it, char const * is more correct actually when it comes to the order.

How can one approach be more correct over another when they have the same meaning? The standard even uses const int *ptr_to_constant; as the first example of a non-constant pointer to a constant value.

I don't have much spare time to police style on PRs, so I usually make one of two choices:

  • Ignore and approve, which reduces the overall level of consistency,
  • Engage in the debate and then have less time for reviewing/working on other PRs which then take longer to resolve.

At the end of the day, this is an open source project and the style should serve us towards making it easier to collaborate and maintain, not the other way round. With collaborators from all over the place, it's not really easy to have a tight consistent style without being prepared to rewrite everyone's PRs ourselves.

We have a tool that that both checks style and auto-formats invalid style. Anything that the tool accepts is acceptable to merge. The tool roughly tries to implement the style guide I linked above. Any sufficiently strong opinions for changes in acceptable style can either propose an update to the tool or the style-guide, but style rules must be able to be auto-enforced and applied by the tool for us to consider them.

@hlyytine
Copy link
Contributor

While container_of() is a quite cool thing in some cases, I am a bit uneasy about it making the assumption a give object is inside another bigger object. That is also fragile and can be a pain to debug this the assumption does not hold. What I like about a generic reference slot is, that it is simple and quite explicit. Using container_of() allows hiding the back-link, but I don't think we have a strong need to do this. Also, more complex reference mechanisms can be build on top or a simple reference slot. And I'm also not too worried about the casting in this case.

If we ever find that cookie references @kent-mcleod mentioned are not enough and we need to return to this topic, please provide some more concrete examples. Right now I just learned about your opinions, but not the reasoning behind them.

But what about this PR now, can we agree to merge this?

OK otherwise, but please check my question above about char const * vs const char *. To me it looks unnecessary style deviation from existing code base.

@hlyytine
Copy link
Contributor

hlyytine commented Mar 24, 2023

Despite my nagging about the style, there's nothing wrong with this; actually quite the opposite. The only thing I miss is being able to access that vm_t vm (which is a local variable in main.c) in modules. But that's out of scope for this PR.

Edit: I don't know what kind of short circuit occurred in my head -- you get that very VM as a parameter to module init function.

@kent-mcleod kent-mcleod merged commit 2e373d2 into seL4:master Mar 29, 2023
@axel-h axel-h deleted the patch-axel-9 branch March 30, 2023 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants