uint32_t | gem_create () |
uint32_t | gem_create_ext () |
void | gem_pool_init () |
void | gem_pool_dump () |
uint32_t | gem_create_from_pool () |
uint32_t gem_create (int fd, uint64_t size);
This wraps the GEM_CREATE ioctl, which allocates a new gem buffer object of size bytes and returns its handle.
uint32_t gem_create_ext (int fd, uint64_t size, uint32_t flags, struct i915_user_extension *ext);
This wraps the GEM_CREATE_EXT ioctl, which allocates a new gem buffer object of size bytes; flags and the optional ext extension chain are passed through to the kernel.
void gem_pool_init (void);
This function initializes the bo pool (a kind of bo cache). Its main purpose is to support working with softpin to achieve pipelined execution on the GPU (without stalls).
For example, imagine code as follows:
uint32_t bb = gem_create(fd, 4096);
uint32_t *bbptr = gem_mmap__device_coherent(fd, bb, ...)
uint32_t *cmd = bbptr;
...
*cmd++ = ...gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- first execbuf

cmd = bbptr;
...
*cmd++ = ... next gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- second execbuf
The above code is prone to a GPU hang: as soon as bb has been submitted to the GPU, we immediately start writing to it again. If the GPU has already started executing the commands from the first execbuf, we overwrite them, leading to unpredictable behavior (the GPU executes a mix of commands from the first and second batch, or we get a GPU hang). To avoid this we could sync after the first execbuf, but then we stall execution. For some tests that might be acceptable, but such "isolated" execution hides bugs (synchronization, cache flushes, etc.).
So, to achieve pipelined execution we need to use another bb. If we want to enqueue more serialized work we need even more bbs (how many depends on execution speed). Handling this manually is cumbersome, as we need to track all the bbs and their status (busy or free).
The solution to the above is the gem pool. It returns the first handle of the requested size which is not busy (or creates a new one if there is none, or all bos are in use). Here's an example of how to use it:
uint64_t bbsize = 4096;
uint32_t bb = gem_create_from_pool(fd, &bbsize, REGION_SMEM);
uint32_t *bbptr = gem_mmap__device_coherent(fd, bb, ...)
uint32_t *cmd = bbptr;
...
*cmd++ = ...gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
gem_munmap(bbptr, bbsize);
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- first execbuf

bbsize = 4096;
bb = gem_create_from_pool(fd, &bbsize, REGION_SMEM);
bbptr = gem_mmap__device_coherent(fd, bb, ...) // map the new bb before writing to it
cmd = bbptr;
...
*cmd++ = ... next gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
gem_munmap(bbptr, bbsize);
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- second execbuf
Assuming the first execbuf is still being executed (so its bb is busy), we get a new bb handle when we call gem_create_from_pool(). When the test completes, the pool is freed automatically in the IGT core (all handles are closed, memory is freed, and the gem pool is reinitialized for the next test).
Some explanation is needed of why we pass a pointer to the size instead of an absolute value. On discrete, depending on the memory placement (region), the object created in memory can be bigger than requested; the actual object size is therefore written back through the pointer. In particular, if we use the allocator to handle vm space and allocate a vma with the requested size (which is smaller than the bo actually created), we can overlap with the next allocation and get -ENOSPC. See the sketch after the gem_create_from_pool() description below.
uint32_t gem_create_from_pool (int fd, uint64_t *size, uint32_t region);
The function returns a bo handle which is free to use (not busy). Internally it iterates over the previously allocated bos and returns the first free one; if there is no free bo, a new one is created.
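To illustrate the size write-back discussed above, here is a small hedged sketch (a fragment assuming an already-open i915 fd and the usual IGT headers; REGION_SMEM and the 4096-byte request are illustrative assumptions). The updated size, not the requested one, should be used for the mapping and for any softpin vma reservation:

uint64_t size = 4096;
uint32_t handle = gem_create_from_pool(fd, &size, REGION_SMEM);

/* On discrete parts the object may have been rounded up, so size can now
 * be larger than the 4096 bytes requested. */
igt_info("requested 4096 bytes, pool object is %" PRIu64 " bytes\n", size);

/* Map and later unmap using the updated size. */
uint32_t *ptr = gem_mmap__device_coherent(fd, handle, 0, size, PROT_READ | PROT_WRITE);
gem_munmap(ptr, size);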