Although the documentation indicates that dataspaces are "an abstraction for objects... accessed in a memory mapped fashion", in practical terms they are an abstraction for a virtual memory region. This abstraction provides an interface for control purposes, with the region itself being accessed using conventional memory access operations.
For the region represented by a dataspace, the mapping to physical memory pages and the availability of data is managed by the entity providing the dataspace, this entity being what is informally known as a dataspace manager. When a program attempts to access the dataspace's memory, such access will occur normally if the memory has been made available to the program.
In cases where a dataspace's memory has not yet been made available, the dataspace manager will be contacted to perform the necessary operations to make it available. In effect, the dataspace manager acts as a pager.
The sequence of events following the access to unavailable memory involves:
Although the kernel is involved in handling the page fault and region mappers are involved in propagating it to the pager, the logical path of communication can be regarded as being a notification of the pager when the fault occurs. More details are shown below.
The role of the kernel can be considered as largely transparent, with only various side-effects occurring as a consequence of the interactions between the user space components.
For a task, the assignment of dataspace regions to virtual addresses is done using the region mapper. This determines where in the task's address space the region corresponding to a dataspace will appear. Since a task may be using multiple dataspaces provided by different entities, the region mapper needs to be able to contact the appropriate entity to satisfy accesses to different memory regions.
When page faults occur, the region mapper for a task will be notified in order to identify the dataspace manager responsible for the dataspace providing the address involved. This dataspace manager entity is then notified by the region mapper using a dataspace fault message so that it may arrange for the appropriate page to be made available.
The role of the region mapper becomes more apparent where multiple pagers are involved, since it must decide which pager is to handle the page fault. Thus, the path of communication involves the region mapper delivering the page fault notification (as a map request) to the appropriate pager.
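The lookup performed by a region mapper can be sketched as a search through a table of attached regions. This is only an illustration under stated assumptions: the `struct region` layout, the `pager_id` field (standing in for whatever reference to the dataspace manager is really held) and both function names are hypothetical and are not taken from the L4Re sources.

```c
#include <stddef.h>

/* Hypothetical region table entry; the names are illustrative only. */
struct region {
    unsigned long start;  /* first address covered by the region */
    unsigned long end;    /* one past the last address covered */
    int pager_id;         /* stand-in for a reference to the dataspace manager */
};

/* Return the region containing addr, or NULL if nothing is attached there. */
static const struct region *find_region(const struct region *regions,
                                        size_t count, unsigned long addr)
{
    for (size_t i = 0; i < count; i++)
        if (addr >= regions[i].start && addr < regions[i].end)
            return &regions[i];
    return NULL;
}

/* Offset of the faulting address within the region found for it. */
static unsigned long region_offset(const struct region *r, unsigned long addr)
{
    return addr - r->start;
}
```

Having found the region, the mapper would forward the fault, expressed as an offset into the dataspace, to the pager recorded for that region. A fault at an address with no region attached cannot be forwarded to any pager.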
The designated pager for a task will receive a map message indicating the details of the memory access causing a page fault. It must respond by granting access to memory available to it. Although a dataspace represents the memory available to a task in a particular region, memory mapping operates at a different level, involving memory pages.
A flexpage is a region of memory defined in terms of a number of memory pages. A flexpage's size is constrained to be a power of two, described by its "order" (the base-2 logarithm of the size), with the resulting size therefore being equal to 2^order bytes. The smallest possible order is that yielding the hardware page size, which on various platforms is 4096 or 2^12 bytes. The size of a flexpage also determines the precision of its position, with this necessarily being a multiple of its size.
Flexpage Sizes
2^p (hardware pages) | 2^(p+1) (2 * page size) | 2^(p+2) (4 * page size) |
3 | 1 | 0 |
2 | 1 | 0 |
1 | 0 | 0 |
0 | 0 | 0 |
Hardware pages appear at boundaries every 2^p bytes, with double-size flexpages only appearing at boundaries every 2^(p+1) bytes, and so on. The L4_PAGESIZE constant indicates the size of hardware pages whereas the L4_PAGESHIFT constant indicates the value of p in the above (since 2^p is equal to 1 shifted left by p bits).
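The relationship between order, size and alignment can be made concrete with a few lines of C. This is a minimal sketch assuming 4096-byte hardware pages: `PAGE_SHIFT` and `PAGE_SIZE` are local stand-ins for L4_PAGESHIFT and L4_PAGESIZE, and the three helper functions are hypothetical names introduced here, not part of any L4Re header.

```c
/* Assumed for illustration: 4096-byte hardware pages, so p = 12. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)

/* Size in bytes of a flexpage of the given order. */
static unsigned long fpage_size(unsigned order)
{
    return 1UL << order;
}

/* Round an address down to the nearest boundary for the given order. */
static unsigned long fpage_base(unsigned long addr, unsigned order)
{
    return addr & ~(fpage_size(order) - 1);
}

/* Non-zero if addr can serve as the base of a flexpage of the given order. */
static int fpage_aligned(unsigned long addr, unsigned order)
{
    return (addr & (fpage_size(order) - 1)) == 0;
}
```

For example, hardware page 3 (at 0x3000) lies within the double-size flexpage based at 0x2000 and within the quadruple-size flexpage based at 0x0, which is what rounding down with orders 13 and 14 gives.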
Types and functions related to flexpages are found here:
The following definitions are of interest when considering what the pager does with map requests:
The base address computed in the pager is distinct from the base in the faulting program, but since it indicates the base of the flexpage to be used to satisfy the memory access, it corresponds to the receive flexpage's base.
The "Flexible-Sized Page-Objects" paper describes different situations to be considered when flexpages are sent to satisfy map requests, involving the relative sizes of the sent and receive flexpages. This paper appears to use the term "fraction" for hot spot.
The objective of the pager is to provide the accessed page within a flexpage of appropriate size. For example, here is an illustration of an access within the page starting at 0x3000, a flexpage offset of 0x3000, and with a hardware page size of 0x1000:
This leads to consideration of the choice of flexpage size, which itself depends on the structure of the memory available to the pager in the context of the accessed memory. Here, base0 indicates the base of the flexpage that might be considered initially, its size being equal to the receive flexpage. Smaller flexpages employ bases corresponding to their first hardware pages.
The pager should first use the flexpage offset (hot spot) to determine the location of the accessed memory page in any flexpage to be sent.
If a contiguous memory region in the pager provides the entire contents of a dataspace, then the address of the accessed memory page will be the result of adding the flexpage offset to the base or start address of the dataspace:
page address = dataspace start + flexpage offset
However, if the memory to be exposed by the pager only provides a portion of the dataspace, with the memory being populated by the appropriate contents in order to satisfy the map request, then the address of the accessed memory page will need to be constrained by the size of the available memory region:
page address = available start + (flexpage offset % available size)
Note that this requires the available size to be an acceptable flexpage size or for such a quantity to be computed from the available size.
Consider the following access within the page starting at 0x7000, with a flexpage offset of 0x7000, and with a hardware page size of 0x1000, where the available memory is limited to only 0x4000 bytes:
Consequently, we may refer to the "start" as either the dataspace start (where the entire dataspace is available) or the available memory region start (where such a region exposes the dataspace content), employing a flexpage offset that has been constrained by the available memory region's size using the modulo operator.
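The two page address calculations can be sketched in C as follows. The function names and the `PAGE_SIZE` constant are assumptions made for illustration. The sketch also rounds the offset down to a page boundary before use, which is an added precaution rather than something the formulas above require, and it assumes the available size is a power of two.

```c
/* Assumed for illustration: 4096-byte hardware pages. */
#define PAGE_SIZE 0x1000UL

/* Whole dataspace resident in the pager:
   page address = dataspace start + flexpage offset. */
static unsigned long page_address_full(unsigned long ds_start,
                                       unsigned long offset)
{
    return ds_start + (offset & ~(PAGE_SIZE - 1));
}

/* Only a window of avail_size bytes resident in the pager:
   page address = available start + (flexpage offset % available size). */
static unsigned long page_address_window(unsigned long avail_start,
                                         unsigned long avail_size,
                                         unsigned long offset)
{
    return avail_start + ((offset & ~(PAGE_SIZE - 1)) % avail_size);
}
```

With a flexpage offset of 0x7000 and an available region of 0x4000 bytes, the constrained offset is 0x3000, so the page would be found 0x3000 bytes above the start of the available region.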
With the accessed page identified, the pager then needs to determine the size of the region it may return as a flexpage. For a contiguous region, this involves the following...
Consider the following access within the given memory region of 0x3000 bytes, with an access within the page starting at 0x1000 above the start of the region, with a flexpage offset of 0x1000, and with a hardware page size of 0x1000:
Here, the region is not large enough for flexpages of 0x4000 bytes (2^(p+2) in size) because the flexpage would exceed the upper boundary of the region. However, a flexpage of 0x2000 bytes (2^(p+1) in size) whose base coincides with the region start can be supported, as can a single page of 0x1000 bytes (2^p in size).
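One way to find the largest flexpage that fits is to start from a single page and keep doubling while the naturally aligned flexpage containing the accessed page stays inside the region. The following is a sketch under assumptions: the function name, its parameters and the choice of 4096-byte pages are all hypothetical, and the flexpage offset is deliberately ignored here.

```c
/* Assumed for illustration: 4096-byte hardware pages, so p = 12. */
#define PAGE_SHIFT 12

/* Return the largest order, from PAGE_SHIFT up to max_order, for which the
   naturally aligned flexpage containing addr lies entirely within
   [region_start, region_end). Addresses are assumed small enough that
   base + size does not overflow. */
static unsigned largest_order_in_region(unsigned long addr,
                                        unsigned long region_start,
                                        unsigned long region_end,
                                        unsigned max_order)
{
    unsigned order = PAGE_SHIFT;

    while (order < max_order) {
        unsigned long size = 1UL << (order + 1);
        unsigned long base = addr & ~(size - 1);

        /* Stop growing once the doubled flexpage leaves the region. */
        if (base < region_start || base + size > region_end)
            break;
        order++;
    }
    return order;
}
```

For a region of 0x3000 bytes starting at a 0x4000-aligned address and an access in its second page, this yields order 13 (0x2000 bytes), since a 0x4000-byte flexpage would extend past the end of the region.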
It can be interesting to consider situations where the flexpage offset does not correspond to the page-aligned offset. Here, an access occurs within the page starting at 0x5000 above the start of the region, but the flexpage offset has a value of 0x1000:
Here, the base is calculated to be at start + 0x4000, and an appropriate flexpage can be determined as before. A possible reason for the flexpage offset not being 0x5000 is a limitation on the size of the receive flexpage.
Unlike the above, where eight pages are available and a base coinciding with start might be envisaged, the receive flexpage might be limited to seven pages. Consequently, the next size down (0x4000 or 2^(p+2)) is chosen, situating the base closer to the accessed page.
It is informative to consider alignment issues. In the following, the access is within the page starting at 0x1000 above the start of the region, the flexpage offset has a value of 0x1000, but start is not aligned to 0x4000 (2^(p+2)) flexpage boundaries:
A flexpage with size 0x1000 (2^p) or 0x2000 (2^(p+1)) anchored at start will be appropriate for the receiver since the start is aligned appropriately for these sizes. However, a flexpage of size 0x4000 (2^(p+2)) anchored at start is not appropriate since it is not appropriately aligned, and any attempt to indicate start as the flexpage base will most likely cause a flexpage at start - 0x2000 to be constructed. Upon receipt, this memory will be incorrectly interpreted by the recipient and the page at start - 0x1000 (in the pager) incorrectly placed at the access location.
It would appear that the l4_fpage_max_order function can help with determining the size of the flexpage that can be sent. Since the calculated flexpage base is the page-aligned access address minus the flexpage offset...
base = pagealign(start + offset) - flexpage offset
And since the alignment of the base can be tested to determine which flexpage sizes it may support, the bitwise operation in this function involving an exclusive-or of the page-aligned address and the hotspot effectively performs this subtraction, leaving behind bits from the page-aligned address that indicate the precision required by this base address.
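The exclusive-or test can be illustrated in isolation. The sketch below is not the code of l4_fpage_max_order: it is a hypothetical helper with its own name and parameters, reproducing only the idea that a flexpage of a given order is usable when the page-aligned address and the hot spot agree in all bits below that order. Bounds checks against the available memory region are omitted here.

```c
/* Return the largest order, from min_order up to max_order, for which the
   page-aligned address addr and the hot spot occupy the same position
   within a flexpage of that order. */
static unsigned max_order_for_hotspot(unsigned long addr,
                                      unsigned long hotspot,
                                      unsigned min_order,
                                      unsigned max_order)
{
    unsigned order = min_order;

    while (order < max_order) {
        /* Bits selecting a position inside a flexpage of order + 1. */
        unsigned long mask = (1UL << (order + 1)) - 1;

        /* Any differing bit means the base would be misaligned. */
        if ((addr ^ hotspot) & mask)
            break;
        order++;
    }
    return order;
}
```

Taking a hypothetical region start of 0x10000, an access in the page at start + 0x5000 with a hot spot of 0x1000 gives an exclusive-or of 0x14000, whose lowest set bit is bit 14. The result is order 14 (0x4000 bytes), and rounding the address down to that size gives a base of start + 0x4000.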
Where the pager does not allocate memory for the entire dataspace being exposed to other tasks, instead maintaining a limited amount of available memory, it will need to invalidate or unmap the pages sent to those tasks as it recycles the available memory for other pages.
If it does not unmap previously sent pages, even though the application will see new data in a page as it first encounters it (due to a page fault occurring), it will ultimately see the data belonging to the most recently mapped page in all accessible previously-encountered pages.
To avoid this, the pager cannot rely on the flexpage sent in response to the map request to somehow automatically invalidate prior flexpages referencing the same memory: such aliasing might be desired in certain situations. Instead, it must explicitly unmap the flexpage for all other tasks but itself, which is possible using the l4_task_unmap function with the L4_FP_OTHER_SPACES flag set.
When exposing memory from a task, conventional memory allocation functions can be used to obtain suitable regions. However, some important considerations apply:
The latter consideration is usually only a problem when prototyping pagers: typically, pagers will populate pages that are exposed to other tasks.
A page fault handler can be found here for reference:
Interestingly, an example uses the map operation:
The "kernel interface" appears to expose region mapping operations:
The following files define the interface to dataspaces provided via capabilities:
This interface features calculations for things like the send base and hot spot.
The protocol opcodes are defined in the following file:
There is a default server implementation of the interface:
Relevant flexpage abstractions are defined here:
The following files define the memory allocation interface:
The root server, Moe, provides various dataspace implementations.
The factory support for creating dataspaces is found here:
Useful for flexpage construction is the address method in the following: