Aleksandar Donev wrote:
> James Giles wrote:
>
>> This part of what you're talking about I've been working
>> at as part of my preprocessed language. I tend to mimic the
>> way virtual memory hardware works. The data (regardless
>> of it's actual size) is allocated in chunks (usually some multiple
>> of the size of a NODE).
> I do the same---I call it a paging system, and use a preprocessor to
> make macros which make for some data-abstraction and hide the layer of
> page indirections. A snip of the involved TYPE definitions is below. It
> is not all perfect though: Each node in my implementation stores which
> page it is on, so it knows it is paged. Can this be avoided, and if so,
> how do I then delete nodes from the pages if I am only passed a node?
In my implementation the page sizes are always a power of
2 multible of the node's size. To the user of the language, the
whole structure look very much like a rank-one array of nodes.
But, when processing the index of a node, the page consists
of the higher order bits and the location within the page is
the low order bits of the index. Since I never use pointers
(I access all nodes based on their index within the "virtual"
array I've created), the node doesn't need to store which page
it on: that's part of the index you used to get to the node.
>> I don't know how you would do this directly in Fortran without
>> exposing the extra level of indirection to view.
> Can it be done in other languages? [...]
I've not seen it, but there are lots of languages I don't know.
> [...] I want to separate the code (not
> preprocessor, but true compiled language) for the data structure from
> the allocator itself, and yet have a relocatable structure. I guess by
> using inlined indexing operators in C++ one can do it, but I would love
> to see an example whose syntax I can actually read and understand how
> it does it. Maybe you can post a little example of your own?
Well, I could show you some source (in my language, not the
generated Fortran). I don't know if that will help. I notice your
example does a lot more that _merely_ implementing an easily
reallocatable object. Or, it seems to. I'd probably have to spend
quite some time to understand it.
The generated Fortran (when the preprocessor hypothetically is
finished) will be something like the following. When and if I have
F2003, I will probably abandon using pointers and switch to
allocatable arrays in my data types instead.
type ptrs ! one of these points to each page
type(node), pointer :: nodes(:)
end type ptrs
! this uses a pointer so that I can reallocate how many pages
! the structure uses
type paged_array
integer :: size
type(ptrs), pointer :: page_heads(:)
end type paged_array
! The page size parameter could be made a field of the above
! paged_array type and it could be a variable set (once! before
! and data is in the structure!) for each separate array is declared
integer, parameter :: page_size = 64
I would have to write a separate set of the above declarations for
each data type I wanted to have paged arrays of.
Now, the rest is code. In outline (since I don't have a finished
preprocessor I can't give you actually generated code), to allocate
or reallocate an array, you decide how many pages the newly
requested size, if necessary you allocate (or reallocate) the
PAGE_HEADS array and then you allocate (or deallocate, if
the new size is smaller) all the pages that are no longer needed.
The SIZE component of the packed_array type would contain
the size of the array actually asked for (not necessarily a multiple
of the page size). The actual amount of space allocated is the
number of non-null pointers times the page size. I could, of
course, just keep that value around instead of needing to compute
it.
References to the paged array would look like array references in
my source language. In the intermediate Fortran they might be
procedure calls or they might be short sequences of statements
or ... I haven't made up my mind yet. Obviously, if I declare:
type (paged_array) :: arr
and subsequently allocated it:
call my_allocate(arr, 500)
In my own language, I want references to the data to look like
x = arr(n)
arr(m:) = a_node
etc.
The first of these could be (assuming N is in range):
x = arr%page_heads(n/64)%nodes(mod(n,64))
Adding a range check would be more difficult. Probably it would
require both references to the array and assignments to it be made
into procedures, elemental to be preferred. Indeed, making the
procedures elemental has the advantage that whole array and array
slice operations can be translated fairly direcly with me having to
analyze them.
I don't know if this helps at all. It's a lot simpler than what you're
doing, but maybe you need some other capability?
As for, say, building a linked list or a directed graph or something:
well I think we all knw the techniques of doing that using array indices
assuming the array is big enough. I now have a means of making an
array that can be efficiently resized so that it can be made big enough
for any contingency. So, such linked structures would be designed
using indices into these arrays instead of using POINTERs at all
(I have no present intention to include POINTERs in my langage).
--
J. Giles
"I conclude that there are two ways of constructing a software
design: One way is to make it so simple that there are obviously
no deficiencies and the other way is to make it so complicated
that there are no obvious deficiencies." -- C. A. R. Hoare
|