9.15.2009

Really complicated: asynchronous buffers between user and kernel space

Continuing from the previous post: the article Memory Marshalling in Windows CE has even more fiddly details..

The previous post covered the simple case: passing a pointer to kernel space through an IO control call.
But that only works inside the IO control function itself.

Suppose the driver starts its own thread, and the pointer passed in through the IO control call is handed to that thread.
That leads to this situation:

The IO control call has already returned, but the kernel-space thread still uses the pointer that was just passed in.

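Because of access permissions, a different set of functions has to be called to remap this kind of pointer:
CeAllocAsynchronousBuffer and CeFreeAsynchronousBuffer.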
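
A minimal sketch of that flow in C, as I understand it: the IOCTL handler asks for an asynchronous version of the pointer before it returns, and the driver thread frees it when it is done. PendingRequest, QueueRequest and CompleteRequest are hypothetical names made up for this example. The #else branch holds desktop stand-ins (with placeholder constant values) so the sketch compiles off-device; they only imitate the copy-to-heap behaviour and are not the real CE implementation. On a real device, pointers embedded inside the IOCTL buffer would presumably also need CeOpenCallerBuffer first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#ifdef UNDER_CE
#include <windows.h>
#include <pkfuncs.h>   /* CeAllocAsynchronousBuffer, CeFreeAsynchronousBuffer */
#else
/* Desktop stand-ins (assumption): NOT the real CE code, placeholder values. */
typedef void *PVOID;
typedef unsigned long DWORD;
typedef long HRESULT;
#define S_OK          ((HRESULT)0)
#define E_OUTOFMEMORY ((HRESULT)-1)
#define ARG_IO_PTR    1u

static HRESULT CeAllocAsynchronousBuffer(PVOID *ppDest, PVOID pSrc,
                                         DWORD cb, DWORD desc)
{
    (void)desc;
    *ppDest = malloc(cb);              /* imitate "duplicate on the heap" */
    if (*ppDest == NULL) return E_OUTOFMEMORY;
    memcpy(*ppDest, pSrc, cb);
    return S_OK;
}

static HRESULT CeFreeAsynchronousBuffer(PVOID pDest, PVOID pSrc,
                                        DWORD cb, DWORD desc)
{
    (void)desc;
    memcpy(pSrc, pDest, cb);           /* write back for an in/out buffer */
    free(pDest);
    return S_OK;
}
#endif

/* Hypothetical per-request state shared by the IOCTL handler and the thread. */
typedef struct {
    PVOID pSync;    /* pointer as seen during the IOCTL call */
    PVOID pAsync;   /* pointer that stays usable after the IOCTL returns */
    DWORD cb;       /* buffer size in bytes */
} PendingRequest;

/* Called from the IOCTL handler, before it returns. */
static HRESULT QueueRequest(PendingRequest *req, PVOID pBuf, DWORD cb)
{
    req->pSync  = pBuf;
    req->pAsync = NULL;
    req->cb     = cb;
    return CeAllocAsynchronousBuffer(&req->pAsync, pBuf, cb, ARG_IO_PTR);
}

/* Called later on the driver's own thread. */
static void CompleteRequest(PendingRequest *req)
{
    unsigned char *p = (unsigned char *)req->pAsync;
    DWORD i;

    assert(req->pAsync != NULL);
    for (i = 0; i < req->cb; i++)      /* sample work: bump every byte */
        p[i] = (unsigned char)(p[i] + 1);

    /* Same source pointer, size and descriptor as in the alloc call. */
    CeFreeAsynchronousBuffer(req->pAsync, req->pSync, req->cb, ARG_IO_PTR);
    req->pAsync = NULL;
}
```

The point of the pairing is that the thread only ever touches pAsync, never the pointer the IOCTL received, and that the free call gets the same arguments as the alloc call.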

The last paragraph from MSDN:
If you are running inside the kernel process and you are using an ARM microprocessor with a virtually tagged cache, you can pass MARSHAL_FORCE_ALIAS as the ArgumentDescriptor. On all other CPUs, inside the kernel process, CeAllocAsynchronousBuffer always creates an alias to the memory with VirtualCopy. On ARM CPUs that use a virtually tagged cache inside the kernel process, CeAllocAsynchronousBuffer will creates a duplicate copy of the memory on the heap by default. On large buffers, creating the duplicate heap can have an effect on performance. To prevent duplication, pass the MARSHAL_FORCE_ALIAS flag to cause CeAllocAsynchronousBuffer to create an alias, instead. However, the creation of aliased memory on ARM CPUs that use a virtually tagged cache cause both the source and destination memory to be accessed as uncached, until the alias is destroyed by CeFreeAsynchronousBuffer. This means that memory accesses become slower at both the source and destination.

Do not use the MARSHAL_FORCE_ALIAS flag unless you are using buffers greater than 16 KB.
Roughly: on an ARM with a virtually tagged cache, CeAllocAsynchronousBuffer is implemented by copying the buffer to the kernel heap. That has an overhead, namely the copy itself.

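So when the buffer is large, the MARSHAL_FORCE_ALIAS option can be used to force CE to create an alias instead.
But because of the ARM architecture, the aliased memory region can no longer be cached, so access is slower (probably a lot slower).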

Hence the recommendation: MARSHAL_FORCE_ALIAS only pays off for sizes above 16 KB.
Really, the right threshold should depend on the ratio of CPU speed to memory speed.
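
The 16 KB rule can be written down as a tiny helper. This is only a sketch: ChooseDescriptor and FORCE_ALIAS_THRESHOLD are hypothetical names, the threshold is just the figure from the MSDN text, and the constants in the #else branch are placeholder stand-ins so it compiles off-device (on CE the real ones come from the SDK headers).

```c
#include <assert.h>

#ifdef UNDER_CE
#include <windows.h>
#include <pkfuncs.h>
#else
/* Desktop stand-ins (assumption): placeholder values, not the SDK's. */
typedef unsigned long DWORD;
#define ARG_IO_PTR          0x00000001ul
#define MARSHAL_FORCE_ALIAS 0x80000000ul
#endif

/* MSDN's suggested cut-off; a tuned value would depend on the hardware. */
#define FORCE_ALIAS_THRESHOLD (16ul * 1024ul)

/* Returns the descriptor to pass to CeAllocAsynchronousBuffer:
   alias only buffers larger than the threshold, copy everything else. */
static DWORD ChooseDescriptor(DWORD baseDescriptor, DWORD cbBuffer)
{
    /* The caller is expected to pass only the direction flag. */
    assert((baseDescriptor & MARSHAL_FORCE_ALIAS) == 0);

    if (cbBuffer > FORCE_ALIAS_THRESHOLD)
        return baseDescriptor | MARSHAL_FORCE_ALIAS;
    return baseDescriptor;
}
```

Whatever descriptor comes out of this has to be used for both the alloc and the free call. Keeping the threshold in one define makes it easy to change after measuring on the actual board.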



--So complicated, so complicated..
