uint32_t | gem_create () |
uint32_t | gem_create_ext () |
void | gem_pool_init () |
void | gem_pool_dump () |
uint32_t | gem_create_from_pool () |
uint32_t gem_create (int fd, uint64_t size);
This wraps the GEM_CREATE ioctl, which allocates a new gem buffer object of size bytes and returns its handle.
uint32_t gem_create_ext (int fd, uint64_t size, uint32_t flags, struct i915_user_extension *ext);
This wraps the GEM_CREATE_EXT ioctl, which allocates a new gem buffer object of size bytes; flags and the optional ext extension chain are passed through to the kernel.
void gem_pool_init (void);
This function initializes the bo pool (a kind of bo cache). Its main purpose is to support working with softpin to achieve pipelined execution on the GPU (without stalls).
For example, imagine code as follows:
uint32_t bb = gem_create(fd, 4096);
uint32_t *bbptr = gem_mmap__device_coherent(fd, bb, ...)
uint32_t *cmd = bbptr;
...
*cmd++ = ...gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- first execbuf

cmd = bbptr;
...
*cmd++ = ... next gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- second execbuf
The above code is prone to a GPU hang: as soon as bb has been submitted to the GPU, we immediately start writing to it again. If the GPU has already started executing the commands from the first execbuf, we overwrite them, leading to unpredictable behavior (the GPU executes a mix of commands from the first and second batch, or we get a GPU hang). To avoid this we could sync after the first execbuf, but then we stall execution. For some tests that might be acceptable, but such "isolated" execution hides bugs (synchronization, cache flushes, etc.).
So, to achieve pipelined execution we need to use another bb. If we want to enqueue more serialized work we need even more bbs (how many depends on execution speed). Handling this manually is cumbersome, as we need to track all the bbs and their status (busy or free).
The solution to the above is the gem pool. It returns the first handle of the requested size which is not busy (or creates a new one if there is none, or all bos are in use). Here's an example of how to use it:
uint64_t bbsize = 4096;
uint32_t bb = gem_create_from_pool(fd, &bbsize, REGION_SMEM);
uint32_t *bbptr = gem_mmap__device_coherent(fd, bb, ...)
uint32_t *cmd = bbptr;
...
*cmd++ = ...gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
gem_munmap(bbptr, bbsize);
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- first execbuf

bbsize = 4096;
bb = gem_create_from_pool(fd, &bbsize, REGION_SMEM);
bbptr = gem_mmap__device_coherent(fd, bb, ...) // map the new bb before writing to it
cmd = bbptr;
...
*cmd++ = ... next gpu commands...
...
*cmd++ = MI_BATCH_BUFFER_END;
gem_munmap(bbptr, bbsize);
...
gem_execbuf(fd, execbuf); // bb is part of execbuf <--- second execbuf
Assuming the first execbuf is still being executed (so its bb is busy), we get a new bb handle when we call gem_create_from_pool(). When the test completes, the pool is freed automatically in the IGT core (all handles are closed, memory is freed, and the gem pool is reinitialized for the next test).
Some explanation is needed of why we pass a pointer to the size instead of an absolute value. On discrete, depending on the memory placement (region), the object created in memory can be bigger than requested; the actual object size is therefore written back through the pointer. In particular, if we use the allocator to handle vm space and allocate a vma with the requested size (which is smaller than the bo actually created), we can overlap with the next allocation and get -ENOSPC. See the sketch after the gem_create_from_pool() description below.
uint32_t gem_create_from_pool (int fd, uint64_t *size, uint32_t region);
The function returns a bo handle which is free to use (not busy). Internally it iterates over the previously allocated bos and returns the first free one; if there is no free bo, a new one is created.
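To illustrate the size write-back discussed above, here is a small hedged sketch (a fragment assuming an already-open i915 fd and the usual IGT headers; REGION_SMEM and the 4096-byte request are illustrative assumptions). The updated size, not the requested one, should be used for the mapping and for any softpin vma reservation:

uint64_t size = 4096;
uint32_t handle = gem_create_from_pool(fd, &size, REGION_SMEM);

/* On discrete parts the object may have been rounded up, so size can now
 * be larger than the 4096 bytes requested. */
igt_info("requested 4096 bytes, pool object is %" PRIu64 " bytes\n", size);

/* Map and later unmap using the updated size. */
uint32_t *ptr = gem_mmap__device_coherent(fd, handle, 0, size, PROT_READ | PROT_WRITE);
gem_munmap(ptr, size);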