> The delay is entirely caused by clearing the array. I'm using 'bzero()' to do that, as that's the fastest way I know of to zero an existing array, but it still takes 25ms to zero a vector of 16 million shorts. (Stepping through it with a for loop takes roughly 5 times as long.)
Well this is where you could gain some time using the mmap, depending on various factors. When you do a fresh mmap of /dev/zero (or ANON) every time you need a new clear array, that will execute much quicker than clearing all that space, because in fact it is only setting up some page table entries that all point to the same already zeroed memory block. Then, when you start populating it of course you lose some of that advantage as each write into a page causes a page fault and a new memory block being allocated and zeroed and inserted into the page table. Only experimentation can show what the total time of the mmap + page COW operations is when compared to the bzero. It will depend on the density of the routed AMPRnet space.
So, you would change the array[2][2**24] into a *array[2] (2 pointers to 16M entries) and mmap/munmap them every time you need them cleared.
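A minimal sketch of that idea in C, assuming Linux (or another system with MAP_ANONYMOUS) and 16-bit table entries. The names table_alloc and table_clear are made up for illustration and are not from the actual program:

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc under strict -std= */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define NUM_ADDRS   ((size_t)1 << 24)                /* 16M entries */
#define TABLE_BYTES (NUM_ADDRS * sizeof(uint16_t))   /* 32 MiB of shorts */

/* Hypothetical stand-in for array[2][2**24]: two pointers to 16M entries. */
static uint16_t *array[2];

/* Map a fresh, zero-filled table. Returns NULL on failure. */
static uint16_t *table_alloc(void)
{
    void *p = mmap(NULL, TABLE_BYTES, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : (uint16_t *)p;
}

/* "Clear" table i by mapping a new zeroed region and dropping the old one.
 * Returns 0 on success, -1 if the new mapping could not be created
 * (in which case the old table is left untouched). */
static int table_clear(int i)
{
    assert(i == 0 || i == 1);
    uint16_t *fresh = table_alloc();
    if (fresh == NULL)
        return -1;
    if (array[i] != NULL)
        munmap(array[i], TABLE_BYTES);
    array[i] = fresh;
    return 0;
}
```

Calling table_clear(0) wherever the bzero() call sits today would be the experiment; whether it wins overall depends on how many pages get written afterwards.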
Rob
I've modified the logic to handle deletions and additions to the global array without zeroing it. And global declarations are allocated from zero pages in 'C', so you don't even need to bzero it at program startup.
Deletions zero the corresponding entries in the addr table, so if the deletion count was non-zero, then when you're through deleting all the expired entries, you run through the subnets table and load the remaining routes into the addr table.
Additions are similar, so when you're through loading the subnet routes table, if there were any additions, you reload the addr table.
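Roughly, the deletion path could look like the sketch below. The struct layout and the names fill_range, reload_addr and delete_expired are invented for illustration, and the real tables may well be organized differently:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_ADDRS ((size_t)1 << 24)

/* Hypothetical subnets-table entry. */
struct subnet {
    uint32_t first;    /* offset of the first address in the 16M space */
    uint32_t count;    /* number of addresses covered */
    uint16_t route;    /* nonzero route index stored in addr[] */
    int      expired;  /* set once the route has timed out */
};

/* Static storage duration: starts out all zero, no bzero() needed. */
static uint16_t addr[NUM_ADDRS];

/* Store value in every addr[] slot covered by subnet s. */
static void fill_range(const struct subnet *s, uint16_t value)
{
    assert((size_t)s->first + s->count <= NUM_ADDRS);
    for (uint32_t i = 0; i < s->count; i++)
        addr[s->first + i] = value;
}

/* Load every route in the subnets table into addr[].
 * Later entries overwrite earlier ones where ranges overlap. */
static void reload_addr(const struct subnet *subnets, size_t n)
{
    for (size_t i = 0; i < n; i++)
        fill_range(&subnets[i], subnets[i].route);
}

/* Zero the addr[] ranges of expired entries and compact the table.
 * If anything was deleted, reload the survivors so that ranges shared
 * with a deleted subnet are restored. Returns the new table length. */
static size_t delete_expired(struct subnet *subnets, size_t n)
{
    size_t kept = 0;
    for (size_t i = 0; i < n; i++) {
        if (subnets[i].expired)
            fill_range(&subnets[i], 0);
        else
            subnets[kept++] = subnets[i];
    }
    if (kept != n)
        reload_addr(subnets, kept);
    return kept;
}
```

Additions would follow the same pattern: append the new entries to the subnets table, then call reload_addr() once if the addition count is non-zero.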
About 5 ms total for a full load including encap disk file read time.
- Brian
44Net mailing list 44Net@hamradio.ucsd.edu http://hamradio.ucsd.edu/mailman/listinfo/44net
On Tue, May 2, 2017 at 12:03 PM, Rob Janssen <pe1chl@amsat.org> wrote:
>> The delay is entirely caused by clearing the array. I'm using 'bzero()' to do that, as that's the fastest way I know of to zero an existing array, but it still takes 25ms to zero a vector of 16 million shorts. (Stepping through it with a for loop takes roughly 5 times as long.)
> Well this is where you could gain some time using the mmap, depending on various factors. When you do a fresh mmap of /dev/zero (or ANON) every time you need a new clear array, that will execute much quicker than clearing all that space, because in fact it is only setting up some page table entries that all point to the same already zeroed memory block. Then, when you start populating it of course you lose some of that advantage as each write into a page causes a page fault and a new memory block being allocated and zeroed and inserted into the page table.
(Pedantic nit: of course you meant the first write into a new page, not each write.)
> Only experimentation can show what the total time of the mmap + page COW operations is when compared to the bzero. It will depend on the density of the routed AMPRnet space.
> So, you would change the array[2][2**24] into a *array[2] (2 pointers to 16M entries) and mmap/munmap them every time you need them cleared.
It's worth noting, too, that a relatively sparse array will still result in the previously calculated amount of virtual address space being allocated to the array, but since much of that will be backed by the zero page, the actual physical memory usage of the program will be rather lower.
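One way to see this for yourself is to ask the kernel how many pages of the mapping are resident after writing only a few entries. This is a rough sketch assuming Linux and its mincore() call; resident_pages and sparse_demo are made-up names, and the count will vary with kernel settings (transparent huge pages, for example, can make each touched region much larger than one page):

```c
#define _DEFAULT_SOURCE   /* exposes mincore() and MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

#define NUM_ADDRS   ((size_t)1 << 24)
#define TABLE_BYTES (NUM_ADDRS * sizeof(uint16_t))

/* Count the pages in [p, p+len) that the kernel reports as resident.
 * p must be page-aligned (mmap results are). Returns -1 on error. */
static long resident_pages(void *p, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t npages = (len + (size_t)page - 1) / (size_t)page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return -1;
    long resident = -1;
    if (mincore(p, len, vec) == 0) {
        resident = 0;
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;   /* low bit set means resident */
    }
    free(vec);
    return resident;
}

/* Map a zeroed 16M-entry table, write one entry in each of `touched`
 * evenly spaced locations, and return the resident page count
 * (-1 on error). The mapping is released before returning. */
static long sparse_demo(size_t touched)
{
    assert(touched > 0 && touched <= 1024);
    uint16_t *t = mmap(NULL, TABLE_BYTES, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (t == MAP_FAILED)
        return -1;
    size_t stride = NUM_ADDRS / touched;   /* in entries, not bytes */
    for (size_t i = 0; i < touched; i++)
        t[i * stride] = 1;                 /* first write faults the page in */
    long r = resident_pages(t, TABLE_BYTES);
    munmap(t, TABLE_BYTES);
    return r;
}
```

Comparing sparse_demo(4) against the total page count (TABLE_BYTES divided by the page size) gives a feel for how little of the 32 MiB is actually backed by private memory when the table is sparse.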
- Dan C.