{
	"id": "2fc93ca7-1675-4548-9ece-25ca9c6e5c75",
	"created_at": "2026-04-06T00:09:03.550767Z",
	"updated_at": "2026-04-10T13:12:25.567745Z",
	"deleted_at": null,
	"sha1_hash": "77aa980d7519578731f35e851fd8fd15739e4271",
	"title": "The Linux Kernel API — The Linux Kernel documentation",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 1525782,
	"plain_text": "The Linux Kernel API — The Linux Kernel documentation\r\nArchived: 2026-04-02 10:45:17 UTC\r\nData Types¶\r\nDoubly Linked Lists¶\r\nvoid list_add (struct list_head * new, struct list_head * head)¶\r\nadd a new entry\r\nParameters\r\nstruct list_head * new\r\nnew entry to be added\r\nstruct list_head * head\r\nlist head to add it after\r\nDescription\r\nInsert a new entry after the specified head. This is good for implementing stacks.\r\nvoid list_add_tail (struct list_head * new, struct list_head * head)¶\r\nadd a new entry\r\nParameters\r\nstruct list_head * new\r\nnew entry to be added\r\nstruct list_head * head\r\nlist head to add it before\r\nDescription\r\nInsert a new entry before the specified head. This is useful for implementing queues.\r\nvoid __list_del_entry (struct list_head * entry)¶\r\ndeletes entry from list.\r\nParameters\r\nstruct list_head * entry\r\nthe element to delete from the list.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 1 of 228\n\nNote\r\nlist_empty() on entry does not return true after this, the entry is in an undefined state.\r\nvoid list_replace (struct list_head * old, struct list_head * new)¶\r\nreplace old entry by new one\r\nParameters\r\nstruct list_head * old\r\nthe element to be replaced\r\nstruct list_head * new\r\nthe new element to insert\r\nDescription\r\nIf old was empty, it will be overwritten.\r\nvoid list_del_init (struct list_head * entry)¶\r\ndeletes entry from list and reinitialize it.\r\nParameters\r\nstruct list_head * entry\r\nthe element to delete from the list.\r\nvoid list_move (struct list_head * list, struct list_head * head)¶\r\ndelete from one list and add as another’s head\r\nParameters\r\nstruct list_head * list\r\nthe entry to move\r\nstruct list_head * head\r\nthe head that will precede our entry\r\nvoid list_move_tail (struct list_head * list, struct list_head * head)¶\r\ndelete from one list and add as another’s tail\r\nParameters\r\nstruct list_head * 
list\r\nthe entry to move\r\nstruct list_head * head\r\nthe head that will follow our entry\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 2 of 228\n\nint list_is_last (const struct list_head * list, const struct list_head * head)¶\r\ntests whether list is the last entry in list head\r\nParameters\r\nconst struct list_head * list\r\nthe entry to test\r\nconst struct list_head * head\r\nthe head of the list\r\nint list_empty (const struct list_head * head)¶\r\ntests whether a list is empty\r\nParameters\r\nconst struct list_head * head\r\nthe list to test.\r\nint list_empty_careful (const struct list_head * head)¶\r\ntests whether a list is empty and not being modified\r\nParameters\r\nconst struct list_head * head\r\nthe list to test\r\nDescription\r\ntests whether a list is empty _and_ checks that no other CPU might be in the process of modifying either member\r\n(next or prev)\r\nNOTE\r\nusing list_empty_careful() without synchronization can only be safe if the only activity that can happen to the\r\nlist entry is list_del_init() . Eg. 
it cannot be used if another CPU could re-list_add() it.\r\nvoid list_rotate_left (struct list_head * head)¶\r\nrotate the list to the left\r\nParameters\r\nstruct list_head * head\r\nthe head of the list\r\nint list_is_singular (const struct list_head * head)¶\r\ntests whether a list has just one entry.\r\nParameters\r\nconst struct list_head * head\r\nthe list to test.\r\nvoid list_cut_position (struct list_head * list, struct list_head * head, struct list_head * entry)¶\r\ncut a list into two\r\nParameters\r\nstruct list_head * list\r\na new list to add all removed entries\r\nstruct list_head * head\r\na list with entries\r\nstruct list_head * entry\r\nan entry within head, could be the head itself and if so we won’t cut the list\r\nDescription\r\nThis helper moves the initial part of head, up to and including entry, from head to list. You should pass on entry\r\nan element you know is on head. list should be an empty list or a list you do not care about losing its data.\r\nvoid list_splice (const struct list_head * list, struct list_head * head)¶\r\njoin two lists, this is designed for stacks\r\nParameters\r\nconst struct list_head * list\r\nthe new list to add.\r\nstruct list_head * head\r\nthe place to add it in the first list.\r\nvoid list_splice_tail (struct list_head * list, struct list_head * head)¶\r\njoin two lists, each list being a queue\r\nParameters\r\nstruct list_head * list\r\nthe new list to add.\r\nstruct list_head * head\r\nthe place to add it in the first list.\r\nvoid list_splice_init (struct list_head * list, struct list_head * head)¶\r\njoin two lists and reinitialise the emptied list.\r\nParameters\r\nstruct list_head * list\r\nthe new list to add.\r\nstruct list_head * head\r\nthe place to add it in the first list.\r\nDescription\r\nThe list at list is reinitialised\r\nvoid 
list_splice_tail_init (struct list_head * list, struct list_head * head)¶\r\njoin two lists and reinitialise the emptied list\r\nParameters\r\nstruct list_head * list\r\nthe new list to add.\r\nstruct list_head * head\r\nthe place to add it in the first list.\r\nDescription\r\nEach of the lists is a queue. The list at list is reinitialised\r\nlist_entry (ptr, type, member)¶\r\nget the struct for this entry\r\nParameters\r\nptr\r\nthe struct list_head pointer.\r\ntype\r\nthe type of the struct this is embedded in.\r\nmember\r\nthe name of the list_head within the struct.\r\nlist_first_entry (ptr, type, member)¶\r\nget the first element from a list\r\nParameters\r\nptr\r\nthe list head to take the element from.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 5 of 228\n\ntype\r\nthe type of the struct this is embedded in.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nNote, that list is expected to be not empty.\r\nlist_last_entry (ptr, type, member)¶\r\nget the last element from a list\r\nParameters\r\nptr\r\nthe list head to take the element from.\r\ntype\r\nthe type of the struct this is embedded in.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nNote, that list is expected to be not empty.\r\nlist_first_entry_or_null (ptr, type, member)¶\r\nget the first element from a list\r\nParameters\r\nptr\r\nthe list head to take the element from.\r\ntype\r\nthe type of the struct this is embedded in.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nNote that if the list is empty, it returns NULL.\r\nlist_next_entry (pos, member)¶\r\nget the next element in list\r\nParameters\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 6 of 228\n\npos\r\nthe type * to cursor\r\nmember\r\nthe name of the list_head within the struct.\r\nlist_prev_entry (pos, member)¶\r\nget the prev element in list\r\nParameters\r\npos\r\nthe type * to 
cursor\r\nmember\r\nthe name of the list_head within the struct.\r\nlist_for_each (pos, head)¶\r\niterate over a list\r\nParameters\r\npos\r\nthe struct list_head to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nlist_for_each_prev (pos, head)¶\r\niterate over a list backwards\r\nParameters\r\npos\r\nthe struct list_head to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nlist_for_each_safe (pos, n, head)¶\r\niterate over a list safe against removal of list entry\r\nParameters\r\npos\r\nthe struct list_head to use as a loop cursor.\r\nn\r\nanother struct list_head to use as temporary storage\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 7 of 228\n\nhead\r\nthe head for your list.\r\nlist_for_each_prev_safe (pos, n, head)¶\r\niterate over a list backwards safe against removal of list entry\r\nParameters\r\npos\r\nthe struct list_head to use as a loop cursor.\r\nn\r\nanother struct list_head to use as temporary storage\r\nhead\r\nthe head for your list.\r\nlist_for_each_entry (pos, head, member)¶\r\niterate over list of given type\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nlist_for_each_entry_reverse (pos, head, member)¶\r\niterate backwards over list of given type.\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nlist_prepare_entry (pos, head, member)¶\r\nprepare a pos entry for use in list_for_each_entry_continue()\r\nParameters\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 8 of 228\n\npos\r\nthe type * to use as a start point\r\nhead\r\nthe head of the list\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nPrepares a pos entry for use as a start point in list_for_each_entry_continue() .\r\nlist_for_each_entry_continue (pos, head, 
member)¶\r\ncontinue iteration over list of given type\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nContinue to iterate over list of given type, continuing after the current position.\r\nlist_for_each_entry_continue_reverse (pos, head, member)¶\r\niterate backwards from the given point\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nStart to iterate over list of given type backwards, continuing after the current position.\r\nlist_for_each_entry_from (pos, head, member)¶\r\niterate over list of given type from the current point\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 9 of 228\n\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nIterate over list of given type, continuing from current position.\r\nlist_for_each_entry_from_reverse (pos, head, member)¶\r\niterate backwards over list of given type from the current point\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nIterate backwards over list of given type, continuing from current position.\r\nlist_for_each_entry_safe (pos, n, head, member)¶\r\niterate over list of given type safe against removal of list entry\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nn\r\nanother type * to use as temporary storage\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nlist_for_each_entry_safe_continue (pos, n, head, member)¶\r\ncontinue list iteration safe against removal\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 10 of 
228\n\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nn\r\nanother type * to use as temporary storage\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nIterate over list of given type, continuing after current point, safe against removal of list entry.\r\nlist_for_each_entry_safe_from (pos, n, head, member)¶\r\niterate over list from current point safe against removal\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nn\r\nanother type * to use as temporary storage\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nIterate over list of given type from current point, safe against removal of list entry.\r\nlist_for_each_entry_safe_reverse (pos, n, head, member)¶\r\niterate backwards over list safe against removal\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nn\r\nanother type * to use as temporary storage\r\nhead\r\nthe head for your list.\r\nmember\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 11 of 228\n\nthe name of the list_head within the struct.\r\nDescription\r\nIterate backwards over list of given type, safe against removal of list entry.\r\nlist_safe_reset_next (pos, n, member)¶\r\nreset a stale list_for_each_entry_safe loop\r\nParameters\r\npos\r\nthe loop cursor used in the list_for_each_entry_safe loop\r\nn\r\ntemporary storage used in list_for_each_entry_safe\r\nmember\r\nthe name of the list_head within the struct.\r\nDescription\r\nlist_safe_reset_next is not safe to use in general if the list may be modified concurrently (eg. the lock is dropped in\r\nthe loop body). 
An exception to this is if the cursor element (pos) is pinned in the list, and list_safe_reset_next is\r\ncalled after re-taking the lock and before completing the current iteration of the loop body.\r\nhlist_for_each_entry (pos, head, member)¶\r\niterate over list of given type\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the hlist_node within the struct.\r\nhlist_for_each_entry_continue (pos, member)¶\r\niterate over a hlist continuing after current point\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nmember\r\nthe name of the hlist_node within the struct.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 12 of 228\n\nhlist_for_each_entry_from (pos, member)¶\r\niterate over a hlist continuing from current point\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nmember\r\nthe name of the hlist_node within the struct.\r\nhlist_for_each_entry_safe (pos, n, head, member)¶\r\niterate over list of given type safe against removal of list entry\r\nParameters\r\npos\r\nthe type * to use as a loop cursor.\r\nn\r\nanother struct hlist_node to use as temporary storage\r\nhead\r\nthe head for your list.\r\nmember\r\nthe name of the hlist_node within the struct.\r\nBasic C Library Functions¶\r\nWhen writing drivers, you cannot in general use routines which are from the C Library. Some of the functions\r\nhave been found generally useful and they are listed below. 
The behaviour of these functions may vary slightly\r\nfrom those defined by ANSI, and these deviations are noted in the text.\r\nString Conversions¶\r\nunsigned long long simple_strtoull (const char * cp, char ** endp, unsigned int base)¶\r\nconvert a string to an unsigned long long\r\nParameters\r\nconst char * cp\r\nThe start of the string\r\nchar ** endp\r\nA pointer to the end of the parsed string will be placed here\r\nunsigned int base\r\nThe number base to use\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 13 of 228\n\nDescription\r\nThis function is obsolete. Please use kstrtoull instead.\r\nunsigned long simple_strtoul (const char * cp, char ** endp, unsigned int base)¶\r\nconvert a string to an unsigned long\r\nParameters\r\nconst char * cp\r\nThe start of the string\r\nchar ** endp\r\nA pointer to the end of the parsed string will be placed here\r\nunsigned int base\r\nThe number base to use\r\nDescription\r\nThis function is obsolete. Please use kstrtoul instead.\r\nlong simple_strtol (const char * cp, char ** endp, unsigned int base)¶\r\nconvert a string to a signed long\r\nParameters\r\nconst char * cp\r\nThe start of the string\r\nchar ** endp\r\nA pointer to the end of the parsed string will be placed here\r\nunsigned int base\r\nThe number base to use\r\nDescription\r\nThis function is obsolete. Please use kstrtol instead.\r\nlong long simple_strtoll (const char * cp, char ** endp, unsigned int base)¶\r\nconvert a string to a signed long long\r\nParameters\r\nconst char * cp\r\nThe start of the string\r\nchar ** endp\r\nA pointer to the end of the parsed string will be placed here\r\nunsigned int base\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 14 of 228\n\nThe number base to use\r\nDescription\r\nThis function is obsolete. 
Please use kstrtoll instead.\r\nint vsnprintf (char * buf, size_t size, const char * fmt, va_list args)¶\r\nFormat a string and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nsize_t size\r\nThe size of the buffer, including the trailing null space\r\nconst char * fmt\r\nThe format string to use\r\nva_list args\r\nArguments for the format string\r\nDescription\r\nThis function generally follows C99 vsnprintf, but has some extensions and a few limitations:\r\n``%n`` is unsupported\r\n``%p*`` is handled by pointer()\r\nSee pointer() or Documentation/printk-formats.txt for a more extensive description.\r\nPlease update the documentation in both places when making changes.\r\nThe return value is the number of characters which would be generated for the given input, excluding the trailing\r\n‘0’, as per ISO C99. If you want to have the exact number of characters written into buf as return value (not\r\nincluding the trailing ‘0’), use vscnprintf() . If the return is greater than or equal to size, the resulting string is\r\ntruncated.\r\nIf you’re not already dealing with a va_list consider using snprintf() .\r\nint vscnprintf (char * buf, size_t size, const char * fmt, va_list args)¶\r\nFormat a string and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nsize_t size\r\nThe size of the buffer, including the trailing null space\r\nconst char * fmt\r\nThe format string to use\r\nva_list args\r\nArguments for the format string\r\nDescription\r\nThe return value is the number of characters which have been written into the buf not including the trailing ‘0’. 
If\r\nsize is == 0 the function returns 0.\r\nIf you’re not already dealing with a va_list consider using scnprintf() .\r\nSee the vsnprintf() documentation for format string extensions over C99.\r\nint snprintf (char * buf, size_t size, const char * fmt, ...)¶\r\nFormat a string and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nsize_t size\r\nThe size of the buffer, including the trailing null space\r\nconst char * fmt\r\nThe format string to use\r\n...\r\nArguments for the format string\r\nDescription\r\nThe return value is the number of characters which would be generated for the given input, excluding the trailing\r\nnull, as per ISO C99. If the return is greater than or equal to size, the resulting string is truncated.\r\nSee the vsnprintf() documentation for format string extensions over C99.\r\nint scnprintf (char * buf, size_t size, const char * fmt, ...)¶\r\nFormat a string and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nsize_t size\r\nThe size of the buffer, including the trailing null space\r\nconst char * fmt\r\nThe format string to use\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 16 of 228\n\n...\r\nArguments for the format string\r\nDescription\r\nThe return value is the number of characters written into buf not including the trailing ‘0’. If size is == 0 the\r\nfunction returns 0.\r\nint vsprintf (char * buf, const char * fmt, va_list args)¶\r\nFormat a string and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nconst char * fmt\r\nThe format string to use\r\nva_list args\r\nArguments for the format string\r\nDescription\r\nThe function returns the number of characters written into buf. 
Use vsnprintf() or vscnprintf() in order to\r\navoid buffer overflows.\r\nIf you’re not already dealing with a va_list consider using sprintf() .\r\nSee the vsnprintf() documentation for format string extensions over C99.\r\nint sprintf (char * buf, const char * fmt, ...)¶\r\nFormat a string and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nconst char * fmt\r\nThe format string to use\r\n...\r\nArguments for the format string\r\nDescription\r\nThe function returns the number of characters written into buf. Use snprintf() or scnprintf() in order to\r\navoid buffer overflows.\r\nSee the vsnprintf() documentation for format string extensions over C99.\r\nint vbin_printf (u32 * bin_buf, size_t size, const char * fmt, va_list args)¶\r\nParse a format string and place args’ binary value in a buffer\r\nParameters\r\nu32 * bin_buf\r\nThe buffer to place args’ binary value\r\nsize_t size\r\nThe size of the buffer (in 32-bit words, not characters)\r\nconst char * fmt\r\nThe format string to use\r\nva_list args\r\nArguments for the format string\r\nDescription\r\nThe format follows C99 vsnprintf, except %n is ignored, and its argument is skipped.\r\nThe return value is the number of 32-bit words which would be generated for the given input.\r\nNOTE\r\nIf the return value is greater than size, the resulting bin_buf is NOT valid for bstr_printf() .\r\nint bstr_printf (char * buf, size_t size, const char * fmt, const u32 * bin_buf)¶\r\nFormat a string from binary arguments and place it in a buffer\r\nParameters\r\nchar * buf\r\nThe buffer to place the result into\r\nsize_t size\r\nThe size of the buffer, including the trailing null space\r\nconst char * fmt\r\nThe format string to use\r\nconst u32 * bin_buf\r\nBinary arguments for the format string\r\nDescription\r\nThis function is like C99 vsnprintf, but the difference is that vsnprintf gets arguments 
from the stack, and bstr_printf\r\ngets arguments from bin_buf, a binary buffer generated by vbin_printf.\r\nThe format follows C99 vsnprintf, but has some extensions:\r\nsee the vsnprintf comment for details.\r\nThe return value is the number of characters which would be generated for the given input, excluding the trailing\r\n‘0’, as per ISO C99. If you want to have the exact number of characters written into buf as return value (not\r\nincluding the trailing ‘0’), use vscnprintf() . If the return is greater than or equal to size, the resulting string is\r\ntruncated.\r\nint bprintf (u32 * bin_buf, size_t size, const char * fmt, ...)¶\r\nParse a format string and place args’ binary value in a buffer\r\nParameters\r\nu32 * bin_buf\r\nThe buffer to place args’ binary value\r\nsize_t size\r\nThe size of the buffer (in 32-bit words, not characters)\r\nconst char * fmt\r\nThe format string to use\r\n...\r\nArguments for the format string\r\nDescription\r\nThe function returns the number of words (u32) written into bin_buf.\r\nint vsscanf (const char * buf, const char * fmt, va_list args)¶\r\nUnformat a buffer into a list of arguments\r\nParameters\r\nconst char * buf\r\ninput buffer\r\nconst char * fmt\r\nformat of buffer\r\nva_list args\r\narguments\r\nint sscanf (const char * buf, const char * fmt, ...)¶\r\nUnformat a buffer into a list of arguments\r\nParameters\r\nconst char * buf\r\ninput buffer\r\nconst char * fmt\r\nformatting of buffer\r\n...\r\nresulting arguments\r\nint kstrtol (const char * s, unsigned int base, long * res)¶\r\nconvert a string to a long\r\nParameters\r\nconst char * s\r\nThe start of the string. The string must be null-terminated, and may also include a single newline before its\r\nterminating null. 
The first character may also be a plus sign or a minus sign.\r\nunsigned int base\r\nThe number base to use. The maximum supported base is 16. If base is given as 0, then the base of the\r\nstring is automatically detected with the conventional semantics - If it begins with 0x the number will be\r\nparsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal\r\nnumber. Otherwise it will be parsed as a decimal.\r\nlong * res\r\nWhere to write the result of the conversion on success.\r\nDescription\r\nReturns 0 on success, -ERANGE on overflow and -EINVAL on parsing error. Used as a replacement for the\r\nobsolete simple_strtoull. Return code must be checked.\r\nint kstrtoul (const char * s, unsigned int base, unsigned long * res)¶\r\nconvert a string to an unsigned long\r\nParameters\r\nconst char * s\r\nThe start of the string. The string must be null-terminated, and may also include a single newline before its\r\nterminating null. The first character may also be a plus sign, but not a minus sign.\r\nunsigned int base\r\nThe number base to use. The maximum supported base is 16. If base is given as 0, then the base of the\r\nstring is automatically detected with the conventional semantics - If it begins with 0x the number will be\r\nparsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal\r\nnumber. Otherwise it will be parsed as a decimal.\r\nunsigned long * res\r\nWhere to write the result of the conversion on success.\r\nDescription\r\nReturns 0 on success, -ERANGE on overflow and -EINVAL on parsing error. Used as a replacement for the\r\nobsolete simple_strtoull. Return code must be checked.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 20 of 228\n\nint kstrtoull (const char * s, unsigned int base, unsigned long long * res)¶\r\nconvert a string to an unsigned long long\r\nParameters\r\nconst char * s\r\nThe start of the string. 
The string must be null-terminated, and may also include a single newline before its\r\nterminating null. The first character may also be a plus sign, but not a minus sign.\r\nunsigned int base\r\nThe number base to use. The maximum supported base is 16. If base is given as 0, then the base of the\r\nstring is automatically detected with the conventional semantics - If it begins with 0x the number will be\r\nparsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal\r\nnumber. Otherwise it will be parsed as a decimal.\r\nunsigned long long * res\r\nWhere to write the result of the conversion on success.\r\nDescription\r\nReturns 0 on success, -ERANGE on overflow and -EINVAL on parsing error. Used as a replacement for the\r\nobsolete simple_strtoull. Return code must be checked.\r\nint kstrtoll (const char * s, unsigned int base, long long * res)¶\r\nconvert a string to a long long\r\nParameters\r\nconst char * s\r\nThe start of the string. The string must be null-terminated, and may also include a single newline before its\r\nterminating null. The first character may also be a plus sign or a minus sign.\r\nunsigned int base\r\nThe number base to use. The maximum supported base is 16. If base is given as 0, then the base of the\r\nstring is automatically detected with the conventional semantics - If it begins with 0x the number will be\r\nparsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal\r\nnumber. Otherwise it will be parsed as a decimal.\r\nlong long * res\r\nWhere to write the result of the conversion on success.\r\nDescription\r\nReturns 0 on success, -ERANGE on overflow and -EINVAL on parsing error. Used as a replacement for the\r\nobsolete simple_strtoull. 
Return code must be checked.\r\nint kstrtouint (const char * s, unsigned int base, unsigned int * res)¶\r\nconvert a string to an unsigned int\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 21 of 228\n\nParameters\r\nconst char * s\r\nThe start of the string. The string must be null-terminated, and may also include a single newline before its\r\nterminating null. The first character may also be a plus sign, but not a minus sign.\r\nunsigned int base\r\nThe number base to use. The maximum supported base is 16. If base is given as 0, then the base of the\r\nstring is automatically detected with the conventional semantics - If it begins with 0x the number will be\r\nparsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal\r\nnumber. Otherwise it will be parsed as a decimal.\r\nunsigned int * res\r\nWhere to write the result of the conversion on success.\r\nDescription\r\nReturns 0 on success, -ERANGE on overflow and -EINVAL on parsing error. Used as a replacement for the\r\nobsolete simple_strtoull. Return code must be checked.\r\nint kstrtoint (const char * s, unsigned int base, int * res)¶\r\nconvert a string to an int\r\nParameters\r\nconst char * s\r\nThe start of the string. The string must be null-terminated, and may also include a single newline before its\r\nterminating null. The first character may also be a plus sign or a minus sign.\r\nunsigned int base\r\nThe number base to use. The maximum supported base is 16. If base is given as 0, then the base of the\r\nstring is automatically detected with the conventional semantics - If it begins with 0x the number will be\r\nparsed as a hexadecimal (case insensitive), if it otherwise begins with 0, it will be parsed as an octal\r\nnumber. Otherwise it will be parsed as a decimal.\r\nint * res\r\nWhere to write the result of the conversion on success.\r\nDescription\r\nReturns 0 on success, -ERANGE on overflow and -EINVAL on parsing error. 
Used as a replacement for the\r\nobsolete simple_strtoull. Return code must be checked.\r\nint kstrtobool (const char * s, bool * res)¶\r\nconvert common user inputs into boolean values\r\nParameters\r\nconst char * s\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 22 of 228\n\ninput string\r\nbool * res\r\nresult\r\nDescription\r\nThis routine returns 0 iff the first character is one of ‘Yy1Nn0’, or [oO][NnFf] for “on” and “off”. Otherwise it\r\nwill return -EINVAL. Value pointed to by res is updated upon finding a match.\r\nString Manipulation¶\r\nint strncasecmp (const char * s1, const char * s2, size_t len)¶\r\nCase insensitive, length-limited string comparison\r\nParameters\r\nconst char * s1\r\nOne string\r\nconst char * s2\r\nThe other string\r\nsize_t len\r\nthe maximum number of characters to compare\r\nchar * strcpy (char * dest, const char * src)¶\r\nCopy a NUL terminated string\r\nParameters\r\nchar * dest\r\nWhere to copy the string to\r\nconst char * src\r\nWhere to copy the string from\r\nchar * strncpy (char * dest, const char * src, size_t count)¶\r\nCopy a length-limited, C-string\r\nParameters\r\nchar * dest\r\nWhere to copy the string to\r\nconst char * src\r\nWhere to copy the string from\r\nsize_t count\r\nThe maximum number of bytes to copy\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 23 of 228\n\nDescription\r\nThe result is not NUL-terminated if the source exceeds count bytes.\r\nIn the case where the length of src is less than that of count, the remainder of dest will be padded with NUL .\r\nsize_t strlcpy (char * dest, const char * src, size_t size)¶\r\nCopy a C-string into a sized buffer\r\nParameters\r\nchar * dest\r\nWhere to copy the string to\r\nconst char * src\r\nWhere to copy the string from\r\nsize_t size\r\nsize of destination buffer\r\nDescription\r\nCompatible with *BSD : the result is always a valid NUL-terminated string that fits in the buffer (unless, of\r\ncourse, the 
buffer size is zero). It does not pad out the result like strncpy() does.\r\nssize_t strscpy (char * dest, const char * src, size_t count)¶\r\nCopy a C-string into a sized buffer\r\nParameters\r\nchar * dest\r\nWhere to copy the string to\r\nconst char * src\r\nWhere to copy the string from\r\nsize_t count\r\nSize of destination buffer\r\nDescription\r\nCopy the string, or as much of it as fits, into the dest buffer. The routine returns the number of characters copied\r\n(not including the trailing NUL) or -E2BIG if the destination buffer wasn’t big enough. The behavior is undefined\r\nif the string buffers overlap. The destination buffer is always NUL terminated, unless it’s zero-sized.\r\nPreferred to strlcpy() since the API doesn’t require reading memory from the src string beyond the specified\r\n“count” bytes, and since the return value is easier to error-check than strlcpy() ‘s. In addition, the\r\nimplementation is robust to the string changing out from underneath it, unlike the current strlcpy()\r\nimplementation.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 24 of 228\n\nPreferred to strncpy() since it always returns a valid string, and doesn’t unnecessarily force the tail of the\r\ndestination buffer to be zeroed. 
If the zeroing is desired, it's likely cleaner to use strscpy() with an overflow test, then just memset() the tail of the dest buffer.
char * strcat (char * dest, const char * src)¶
Append one NUL-terminated string to another
Parameters
char * dest
The string to be appended to
const char * src
The string to append to it
char * strncat (char * dest, const char * src, size_t count)¶
Append a length-limited, C-string to another
Parameters
char * dest
The string to be appended to
const char * src
The string to append to it
size_t count
The maximum number of bytes to copy
Description
Note that in contrast to strncpy(), strncat() ensures the result is terminated.
size_t strlcat (char * dest, const char * src, size_t count)¶
Append a length-limited, C-string to another
Parameters
char * dest
The string to be appended to
const char * src
The string to append to it
size_t count
The size of the destination buffer.
int strcmp (const char * cs, const char * ct)¶
Compare two strings
Parameters
const char * cs
One string
const char * ct
Another string
int strncmp (const char * cs, const char * ct, size_t count)¶
Compare two length-limited strings
Parameters
const char * cs
One string
const char * ct
Another string
size_t count
The maximum number of bytes to compare
char * strchr (const char * s, int c)¶
Find the first occurrence of a character in a string
Parameters
const char * s
The string to be searched
int c
The character to search for
char * strchrnul (const char * s, int c)¶
Find and return a character in a string, or end of string
Parameters
const char * s
The string to be searched
int c
The character to search for
Description
Returns pointer to first occurrence of 'c' in s. 
If c is not found, then return a pointer to the null byte at the end of s.\r\nchar * strrchr (const char * s, int c)¶\r\nFind the last occurrence of a character in a string\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 26 of 228\n\nParameters\r\nconst char * s\r\nThe string to be searched\r\nint c\r\nThe character to search for\r\nchar * strnchr (const char * s, size_t count, int c)¶\r\nFind a character in a length limited string\r\nParameters\r\nconst char * s\r\nThe string to be searched\r\nsize_t count\r\nThe number of characters to be searched\r\nint c\r\nThe character to search for\r\nchar * skip_spaces (const char * str)¶\r\nRemoves leading whitespace from str.\r\nParameters\r\nconst char * str\r\nThe string to be stripped.\r\nDescription\r\nReturns a pointer to the first non-whitespace character in str.\r\nchar * strim (char * s)¶\r\nRemoves leading and trailing whitespace from s.\r\nParameters\r\nchar * s\r\nThe string to be stripped.\r\nDescription\r\nNote that the first trailing whitespace is replaced with a NUL-terminator in the given string s. 
Returns a pointer to the first non-whitespace character in s.
size_t strlen (const char * s)¶
Find the length of a string
Parameters
const char * s
The string to be sized
size_t strnlen (const char * s, size_t count)¶
Find the length of a length-limited string
Parameters
const char * s
The string to be sized
size_t count
The maximum number of bytes to search
size_t strspn (const char * s, const char * accept)¶
Calculate the length of the initial substring of s which contains only letters in accept
Parameters
const char * s
The string to be searched
const char * accept
The string to search for
size_t strcspn (const char * s, const char * reject)¶
Calculate the length of the initial substring of s which does not contain letters in reject
Parameters
const char * s
The string to be searched
const char * reject
The string to avoid
char * strpbrk (const char * cs, const char * ct)¶
Find the first occurrence of a set of characters
Parameters
const char * cs
The string to be searched
const char * ct
The characters to search for
char * strsep (char ** s, const char * ct)¶
Split a string into tokens
Parameters
char ** s
The string to be searched
const char * ct
The characters to search for
Description
strsep() updates s to point after the token, ready for the next call.
It returns empty tokens, too, behaving exactly like the libc function of that name. In fact, it was stolen from glibc2 and de-fancy-fied. Same semantics, slimmer shape. 
;)\r\nbool sysfs_streq (const char * s1, const char * s2)¶\r\nreturn true if strings are equal, modulo trailing newline\r\nParameters\r\nconst char * s1\r\none string\r\nconst char * s2\r\nanother string\r\nDescription\r\nThis routine returns true iff two strings are equal, treating both NUL and newline-then-NUL as equivalent string\r\nterminations. It’s geared for use with sysfs input strings, which generally terminate with newlines but are\r\ncompared against values without newlines.\r\nint match_string (const char *const * array, size_t n, const char * string)¶\r\nmatches given string in an array\r\nParameters\r\nconst char *const * array\r\narray of strings\r\nsize_t n\r\nnumber of strings in the array or -1 for NULL terminated arrays\r\nconst char * string\r\nstring to match with\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 29 of 228\n\nReturn\r\nindex of a string in the array if matches, or -EINVAL otherwise.\r\nint __sysfs_match_string (const char *const * array, size_t n, const char * str)¶\r\nmatches given string in an array\r\nParameters\r\nconst char *const * array\r\narray of strings\r\nsize_t n\r\nnumber of strings in the array or -1 for NULL terminated arrays\r\nconst char * str\r\nstring to match with\r\nDescription\r\nReturns index of str in the array or -EINVAL, just like match_string() . Uses sysfs_streq instead of strcmp for\r\nmatching.\r\nvoid * memset (void * s, int c, size_t count)¶\r\nFill a region of memory with the given value\r\nParameters\r\nvoid * s\r\nPointer to the start of the area.\r\nint c\r\nThe byte to fill the area with\r\nsize_t count\r\nThe size of the area.\r\nDescription\r\nDo not use memset() to access IO space, use memset_io() instead.\r\nvoid memzero_explicit (void * s, size_t count)¶\r\nFill a region of memory (e.g. 
sensitive keying data) with 0s.\r\nParameters\r\nvoid * s\r\nPointer to the start of the area.\r\nsize_t count\r\nThe size of the area.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 30 of 228\n\nNote\r\nusually using memset() is just fine (!), but in cases where clearing out _local_ data at the end of a scope is\r\nnecessary, memzero_explicit() should be used instead in order to prevent the compiler from optimising away\r\nzeroing.\r\nmemzero_explicit() doesn’t need an arch-specific version as it just invokes the one of memset() implicitly.\r\nvoid * memcpy (void * dest, const void * src, size_t count)¶\r\nCopy one area of memory to another\r\nParameters\r\nvoid * dest\r\nWhere to copy to\r\nconst void * src\r\nWhere to copy from\r\nsize_t count\r\nThe size of the area.\r\nDescription\r\nYou should not use this function to access IO space, use memcpy_toio() or memcpy_fromio() instead.\r\nvoid * memmove (void * dest, const void * src, size_t count)¶\r\nCopy one area of memory to another\r\nParameters\r\nvoid * dest\r\nWhere to copy to\r\nconst void * src\r\nWhere to copy from\r\nsize_t count\r\nThe size of the area.\r\nDescription\r\nUnlike memcpy() , memmove() copes with overlapping areas.\r\n__visible int memcmp (const void * cs, const void * ct, size_t count)¶\r\nCompare two areas of memory\r\nParameters\r\nconst void * cs\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 31 of 228\n\nOne area of memory\r\nconst void * ct\r\nAnother area of memory\r\nsize_t count\r\nThe size of the area.\r\nvoid * memscan (void * addr, int c, size_t size)¶\r\nFind a character in an area of memory.\r\nParameters\r\nvoid * addr\r\nThe memory area\r\nint c\r\nThe byte to search for\r\nsize_t size\r\nThe size of the area.\r\nDescription\r\nreturns the address of the first occurrence of c, or 1 byte past the area if c is not found\r\nchar * strstr (const char * s1, const char * s2)¶\r\nFind the first substring in a NUL terminated 
string\r\nParameters\r\nconst char * s1\r\nThe string to be searched\r\nconst char * s2\r\nThe string to search for\r\nchar * strnstr (const char * s1, const char * s2, size_t len)¶\r\nFind the first substring in a length-limited string\r\nParameters\r\nconst char * s1\r\nThe string to be searched\r\nconst char * s2\r\nThe string to search for\r\nsize_t len\r\nthe maximum number of characters to search\r\nvoid * memchr (const void * s, int c, size_t n)¶\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 32 of 228\n\nFind a character in an area of memory.\r\nParameters\r\nconst void * s\r\nThe memory area\r\nint c\r\nThe byte to search for\r\nsize_t n\r\nThe size of the area.\r\nDescription\r\nreturns the address of the first occurrence of c, or NULL if c is not found\r\nvoid * memchr_inv (const void * start, int c, size_t bytes)¶\r\nFind an unmatching character in an area of memory.\r\nParameters\r\nconst void * start\r\nThe memory area\r\nint c\r\nFind a character other than c\r\nsize_t bytes\r\nThe size of the area.\r\nDescription\r\nreturns the address of the first character other than c, or NULL if the whole buffer contains just c.\r\nchar * strreplace (char * s, char old, char new)¶\r\nReplace all occurrences of character in string.\r\nParameters\r\nchar * s\r\nThe string to operate on.\r\nchar old\r\nThe character being replaced.\r\nchar new\r\nThe character old is replaced with.\r\nDescription\r\nReturns pointer to the nul byte at the end of s.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 33 of 228\n\nBit Operations¶\r\nvoid set_bit (long nr, volatile unsigned long * addr)¶\r\nAtomically set a bit in memory\r\nParameters\r\nlong nr\r\nthe bit to set\r\nvolatile unsigned long * addr\r\nthe address to start counting from\r\nDescription\r\nThis function is atomic and may not be reordered. 
See __set_bit() if you do not require the atomic guarantees.\r\nNote\r\nthere are no guarantees that this function will not be reordered on non x86 architectures, so if you are writing\r\nportable code, make sure not to rely on its reordering guarantees.\r\nNote that nr may be almost arbitrarily large; this function is not restricted to acting on a single-word quantity.\r\nvoid __set_bit (long nr, volatile unsigned long * addr)¶\r\nSet a bit in memory\r\nParameters\r\nlong nr\r\nthe bit to set\r\nvolatile unsigned long * addr\r\nthe address to start counting from\r\nDescription\r\nUnlike set_bit() , this function is non-atomic and may be reordered. If it’s called on the same region of memory\r\nsimultaneously, the effect may be that only one operation succeeds.\r\nvoid clear_bit (long nr, volatile unsigned long * addr)¶\r\nClears a bit in memory\r\nParameters\r\nlong nr\r\nBit to clear\r\nvolatile unsigned long * addr\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 34 of 228\n\nAddress to start counting from\r\nDescription\r\nclear_bit() is atomic and may not be reordered. However, it does not contain a memory barrier, so if it is used\r\nfor locking purposes, you should call smp_mb__before_atomic() and/or smp_mb__after_atomic() in order to\r\nensure changes are visible on other processors.\r\nvoid __change_bit (long nr, volatile unsigned long * addr)¶\r\nToggle a bit in memory\r\nParameters\r\nlong nr\r\nthe bit to change\r\nvolatile unsigned long * addr\r\nthe address to start counting from\r\nDescription\r\nUnlike change_bit() , this function is non-atomic and may be reordered. 
If it’s called on the same region of\r\nmemory simultaneously, the effect may be that only one operation succeeds.\r\nvoid change_bit (long nr, volatile unsigned long * addr)¶\r\nToggle a bit in memory\r\nParameters\r\nlong nr\r\nBit to change\r\nvolatile unsigned long * addr\r\nAddress to start counting from\r\nDescription\r\nchange_bit() is atomic and may not be reordered. Note that nr may be almost arbitrarily large; this function is\r\nnot restricted to acting on a single-word quantity.\r\nbool test_and_set_bit (long nr, volatile unsigned long * addr)¶\r\nSet a bit and return its old value\r\nParameters\r\nlong nr\r\nBit to set\r\nvolatile unsigned long * addr\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 35 of 228\n\nAddress to count from\r\nDescription\r\nThis operation is atomic and cannot be reordered. It also implies a memory barrier.\r\nbool test_and_set_bit_lock (long nr, volatile unsigned long * addr)¶\r\nSet a bit and return its old value for lock\r\nParameters\r\nlong nr\r\nBit to set\r\nvolatile unsigned long * addr\r\nAddress to count from\r\nDescription\r\nThis is the same as test_and_set_bit on x86.\r\nbool __test_and_set_bit (long nr, volatile unsigned long * addr)¶\r\nSet a bit and return its old value\r\nParameters\r\nlong nr\r\nBit to set\r\nvolatile unsigned long * addr\r\nAddress to count from\r\nDescription\r\nThis operation is non-atomic and can be reordered. If two examples of this operation race, one can appear to\r\nsucceed but actually fail. You must protect multiple accesses with a lock.\r\nbool test_and_clear_bit (long nr, volatile unsigned long * addr)¶\r\nClear a bit and return its old value\r\nParameters\r\nlong nr\r\nBit to clear\r\nvolatile unsigned long * addr\r\nAddress to count from\r\nDescription\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 36 of 228\n\nThis operation is atomic and cannot be reordered. 
It also implies a memory barrier.\r\nbool __test_and_clear_bit (long nr, volatile unsigned long * addr)¶\r\nClear a bit and return its old value\r\nParameters\r\nlong nr\r\nBit to clear\r\nvolatile unsigned long * addr\r\nAddress to count from\r\nDescription\r\nThis operation is non-atomic and can be reordered. If two examples of this operation race, one can appear to\r\nsucceed but actually fail. You must protect multiple accesses with a lock.\r\nNote\r\nthe operation is performed atomically with respect to the local CPU, but not other CPUs. Portable code should not\r\nrely on this behaviour. KVM relies on this behaviour on x86 for modifying memory that is also accessed from a\r\nhypervisor on the same CPU if running in a VM: don’t change this without also updating arch/x86/kernel/kvm.c\r\nbool test_and_change_bit (long nr, volatile unsigned long * addr)¶\r\nChange a bit and return its old value\r\nParameters\r\nlong nr\r\nBit to change\r\nvolatile unsigned long * addr\r\nAddress to count from\r\nDescription\r\nThis operation is atomic and cannot be reordered. 
It also implies a memory barrier.\r\nbool test_bit (int nr, const volatile unsigned long * addr)¶\r\nDetermine whether a bit is set\r\nParameters\r\nint nr\r\nbit number to test\r\nconst volatile unsigned long * addr\r\nAddress to start counting from\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 37 of 228\n\nunsigned long __ffs (unsigned long word)¶\r\nfind first set bit in word\r\nParameters\r\nunsigned long word\r\nThe word to search\r\nDescription\r\nUndefined if no bit exists, so code should check against 0 first.\r\nunsigned long ffz (unsigned long word)¶\r\nfind first zero bit in word\r\nParameters\r\nunsigned long word\r\nThe word to search\r\nDescription\r\nUndefined if no zero exists, so code should check against ~0UL first.\r\nint ffs (int x)¶\r\nfind first set bit in word\r\nParameters\r\nint x\r\nthe word to search\r\nDescription\r\nThis is defined the same way as the libc and compiler builtin ffs routines, therefore differs in spirit from the other\r\nbitops.\r\nffs(value) returns 0 if value is 0 or the position of the first set bit if value is nonzero. The first (least significant) bit\r\nis at position 1.\r\nint fls (int x)¶\r\nfind last set bit in word\r\nParameters\r\nint x\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 38 of 228\n\nthe word to search\r\nDescription\r\nThis is defined in a similar way as the libc and compiler builtin ffs, but returns the position of the most significant\r\nset bit.\r\nfls(value) returns 0 if value is 0 or the position of the last set bit if value is nonzero. 
The last (most significant) bit\r\nis at position 32.\r\nint fls64 (__u64 x)¶\r\nfind last set bit in a 64-bit word\r\nParameters\r\n__u64 x\r\nthe word to search\r\nDescription\r\nThis is defined in a similar way as the libc and compiler builtin ffsll, but returns the position of the most\r\nsignificant set bit.\r\nfls64(value) returns 0 if value is 0 or the position of the last set bit if value is nonzero. The last (most significant)\r\nbit is at position 64.\r\nBasic Kernel Library Functions¶\r\nThe Linux kernel provides more basic utility functions.\r\nBitmap Operations¶\r\nvoid __bitmap_shift_right (unsigned long * dst, const unsigned long * src, unsigned shift, unsigned nbits)¶\r\nlogical right shift of the bits in a bitmap\r\nParameters\r\nunsigned long * dst\r\ndestination bitmap\r\nconst unsigned long * src\r\nsource bitmap\r\nunsigned shift\r\nshift by this many bits\r\nunsigned nbits\r\nbitmap size, in bits\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 39 of 228\n\nDescription\r\nShifting right (dividing) means moving bits in the MS -\u003e LS bit direction. Zeros are fed into the vacated MS\r\npositions and the LS bits shifted off the bottom are lost.\r\nvoid __bitmap_shift_left (unsigned long * dst, const unsigned long * src, unsigned int shift, unsigned\r\nint nbits)¶\r\nlogical left shift of the bits in a bitmap\r\nParameters\r\nunsigned long * dst\r\ndestination bitmap\r\nconst unsigned long * src\r\nsource bitmap\r\nunsigned int shift\r\nshift by this many bits\r\nunsigned int nbits\r\nbitmap size, in bits\r\nDescription\r\nShifting left (multiplying) means moving bits in the LS -\u003e MS direction. 
Zeros are fed into the vacated LS bit positions and those MS bits shifted off the top are lost.
unsigned long bitmap_find_next_zero_area_off (unsigned long * map, unsigned long size, unsigned long start, unsigned int nr, unsigned long align_mask, unsigned long align_offset)¶
find a contiguous aligned zero area
Parameters
unsigned long * map
The address to base the search on
unsigned long size
The bitmap size in bits
unsigned long start
The bit number to start searching at
unsigned int nr
The number of zeroed bits we're looking for
unsigned long align_mask
Alignment mask for zero area
unsigned long align_offset
Alignment offset for zero area.
Description
The align_mask should be one less than a power of 2; the effect is that the bit offset of all zero areas this function finds plus align_offset is a multiple of that power of 2.
int __bitmap_parse (const char * buf, unsigned int buflen, int is_user, unsigned long * maskp, int nmaskbits)¶
convert an ASCII hex string into a bitmap.
Parameters
const char * buf
pointer to buffer containing string.
unsigned int buflen
buffer size in bytes. If string is smaller than this then it must be terminated with a 0.
int is_user
location of buffer, 0 indicates kernel space
unsigned long * maskp
pointer to bitmap array that will contain result.
int nmaskbits
size of bitmap, in bits.
Description
Commas group hex digits into chunks. Each chunk defines exactly 32 bits of the resultant bitmask. No chunk may specify a value larger than 32 bits (-EOVERFLOW), and if a chunk specifies a smaller value then leading 0-bits are prepended. 
-EINVAL is returned for illegal characters and for grouping errors such as “1,,5”, ”,44”, ”,” and “”.\r\nLeading and trailing whitespace accepted, but not embedded whitespace.\r\nint bitmap_parse_user (const char __user * ubuf, unsigned int ulen, unsigned long * maskp, int nmaskbits)¶\r\nconvert an ASCII hex string in a user buffer into a bitmap\r\nParameters\r\nconst char __user * ubuf\r\npointer to user buffer containing string.\r\nunsigned int ulen\r\nbuffer size in bytes. If string is smaller than this then it must be terminated with a 0.\r\nunsigned long * maskp\r\npointer to bitmap array that will contain result.\r\nint nmaskbits\r\nsize of bitmap, in bits.\r\nDescription\r\nWrapper for __bitmap_parse() , providing it with user buffer.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 41 of 228\n\nWe cannot have this as an inline function in bitmap.h because it needs linux/uaccess.h to get the access_ok()\r\ndeclaration and this causes cyclic dependencies.\r\nint bitmap_print_to_pagebuf (bool list, char * buf, const unsigned long * maskp, int nmaskbits)¶\r\nconvert bitmap to list or hex format ASCII string\r\nParameters\r\nbool list\r\nindicates whether the bitmap must be list\r\nchar * buf\r\npage aligned buffer into which string is placed\r\nconst unsigned long * maskp\r\npointer to bitmap to convert\r\nint nmaskbits\r\nsize of bitmap, in bits\r\nDescription\r\nOutput format is a comma-separated list of decimal numbers and ranges if list is specified or hex digits grouped\r\ninto comma-separated sets of 8 digits/set. 
Returns the number of characters written to buf.\r\nIt is assumed that buf is a pointer into a PAGE_SIZE area and that sufficient storage remains at buf to\r\naccommodate the bitmap_print_to_pagebuf() output.\r\nint bitmap_parselist_user (const char __user * ubuf, unsigned int ulen, unsigned long * maskp, int nmaskbits)¶\r\nParameters\r\nconst char __user * ubuf\r\npointer to user buffer containing string.\r\nunsigned int ulen\r\nbuffer size in bytes. If string is smaller than this then it must be terminated with a 0.\r\nunsigned long * maskp\r\npointer to bitmap array that will contain result.\r\nint nmaskbits\r\nsize of bitmap, in bits.\r\nDescription\r\nWrapper for bitmap_parselist() , providing it with user buffer.\r\nWe cannot have this as an inline function in bitmap.h because it needs linux/uaccess.h to get the access_ok()\r\ndeclaration and this causes cyclic dependencies.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 42 of 228\n\nvoid bitmap_remap (unsigned long * dst, const unsigned long * src, const unsigned long * old, const unsigned\r\nlong * new, unsigned int nbits)¶\r\nApply map defined by a pair of bitmaps to another bitmap\r\nParameters\r\nunsigned long * dst\r\nremapped result\r\nconst unsigned long * src\r\nsubset to be remapped\r\nconst unsigned long * old\r\ndefines domain of map\r\nconst unsigned long * new\r\ndefines range of map\r\nunsigned int nbits\r\nnumber of bits in each of these bitmaps\r\nDescription\r\nLet old and new define a mapping of bit positions, such that whatever position is held by the n-th set bit in old is\r\nmapped to the n-th set bit in new. 
In the more general case, allowing for the possibility that the weight 'w' of new is less than the weight of old, map the position of the n-th set bit in old to the position of the m-th set bit in new, where m == n % w.
If either of the old and new bitmaps is empty, or if src and dst point to the same location, then this routine copies src to dst.
The positions of unset bits in old are mapped to themselves (the identity map).
Apply the above specified mapping to src, placing the result in dst, clearing any bits previously set in dst.
For example, let's say that old has bits 4 through 7 set, and new has bits 12 through 15 set. This defines the mapping of bit position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other bit positions unchanged. So if, say, src comes into this routine with bits 1, 5 and 7 set, then dst should leave with bits 1, 13 and 15 set.
int bitmap_bitremap (int oldbit, const unsigned long * old, const unsigned long * new, int bits)¶
Apply map defined by a pair of bitmaps to a single bit
Parameters
int oldbit
bit position to be mapped
const unsigned long * old
defines domain of map
const unsigned long * new
defines range of map
int bits
number of bits in each of these bitmaps
Description
Let old and new define a mapping of bit positions, such that whatever position is held by the n-th set bit in old is mapped to the n-th set bit in new. 
In the more general case, allowing for the possibility that the weight 'w' of new is less than the weight of old, map the position of the n-th set bit in old to the position of the m-th set bit in new, where m == n % w.
The positions of unset bits in old are mapped to themselves (the identity map).
Apply the above specified mapping to bit position oldbit, returning the new bit position.
For example, let's say that old has bits 4 through 7 set, and new has bits 12 through 15 set. This defines the mapping of bit position 4 to 12, 5 to 13, 6 to 14 and 7 to 15, and of all other bit positions unchanged. So if, say, oldbit is 5, then this routine returns 13.
void bitmap_onto (unsigned long * dst, const unsigned long * orig, const unsigned long * relmap, unsigned int bits)¶
translate one bitmap relative to another
Parameters
unsigned long * dst
resulting translated bitmap
const unsigned long * orig
original untranslated bitmap
const unsigned long * relmap
bitmap relative to which translated
unsigned int bits
number of bits in each of these bitmaps
Description
Set the n-th bit of dst iff there exists some m such that the n-th bit of relmap is set, the m-th bit of orig is set, and the n-th bit of relmap is also the m-th _set_ bit of relmap. (If you understood the previous sentence the first time you read it, you're overqualified for your current job.)
In other words, orig is mapped onto (surjectively) dst, using the map { <n, m> | the n-th bit of relmap is the m-th set bit of relmap }.
Any set bits in orig above bit number W, where W is the weight of (number of set bits in) relmap, are mapped nowhere. In particular, if for all bits m set in orig, m >= W, then dst will end up empty. 
In situations where the possibility of such an empty result is not desired, one way to avoid it is to use the bitmap_fold() operator, below, to first fold the orig bitmap over itself so that all its set bits x are in the range 0 <= x < W. The bitmap_fold() operator does this by setting the bit (m % W) in dst, for each bit (m) set in orig.
Example [1] for bitmap_onto():
Let's say relmap has bits 30-39 set, and orig has bits 1, 3, 5, 7, 9 and 11 set. Then on return from this routine, dst will have bits 31, 33, 35, 37 and 39 set.
When bit 0 is set in orig, it means turn on the bit in dst corresponding to whatever is the first bit (if any) that is turned on in relmap. Since bit 0 was off in the above example, we leave off that bit (bit 30) in dst.
When bit 1 is set in orig (as in the above example), it means turn on the bit in dst corresponding to whatever is the second bit that is turned on in relmap. The second bit in relmap that was turned on in the above example was bit 31, so we turned on bit 31 in dst.
Similarly, we turned on bits 33, 35, 37 and 39 in dst, because they were the 4th, 6th, 8th and 10th set bits in relmap, and the 4th, 6th, 8th and 10th bits of orig (i.e. bits 3, 5, 7 and 9) were also set.
When bit 11 is set in orig, it means turn on the bit in dst corresponding to whatever is the twelfth bit that is turned on in relmap. 
In the above example, there were only ten bits turned on in relmap (30..39), so the fact that bit 11 was set in orig had no effect on dst.
Example [2] for bitmap_fold() + bitmap_onto():
Let's say relmap has these ten bits set:
40 41 42 43 45 48 53 61 74 95
(for the curious, that's 40 plus the first ten terms of the Fibonacci sequence.)
Further, let's say we use the following code, invoking bitmap_fold() then bitmap_onto(), as suggested above to avoid the possibility of an empty dst result:
unsigned long *tmp; // a temporary bitmap's bits
bitmap_fold(tmp, orig, bitmap_weight(relmap, bits), bits);
bitmap_onto(dst, tmp, relmap, bits);
Then this table shows what various values of dst would be, for various orig's. I list the zero-based positions of each set bit. The tmp column shows the intermediate result, as computed by using bitmap_fold() to fold the orig bitmap modulo ten (the weight of relmap):
orig          tmp         dst
0             0           40
1             1           41
9             9           95
10            0           40 [1]
1 3 5 7       1 3 5 7     41 43 48 61
0 1 2 3 4     0 1 2 3 4   40 41 42 43 45
0 9 18 27     0 9 8 7     40 61 74 95
0 10 20 30    0           40
0 11 22 33    0 1 2 3     40 41 42 43
0 12 24 36    0 2 4 6     40 42 45 53
78 102 211    1 2 8       41 42 74 [1]
[1] (1, 2) For these marked lines, if we hadn't first done bitmap_fold() into tmp, then the dst result would have been empty.
If either of orig or relmap is empty (no set bits), then dst will be returned empty.
If (as explained above) the only set bits in orig are in positions m where m >= W (where W is the weight of relmap), then dst will once again be returned empty.
All bits in dst not set by the above rule are cleared.
void bitmap_fold (unsigned long * dst, const unsigned long * orig, unsigned int sz, unsigned int nbits)¶
fold larger bitmap into smaller, modulo specified size
Parameters
unsigned long * dst
resulting smaller bitmap
const 
unsigned long * orig\r\noriginal larger bitmap\r\nunsigned int sz\r\nspecified size\r\nunsigned int nbits\r\nnumber of bits in each of these bitmaps\r\nDescription\r\nFor each bit oldbit in orig, set bit oldbit mod sz in dst. Clear all other bits in dst. See further the comment and\r\nExample [2] for bitmap_onto() for why and how to use this.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 46 of 228\n\nint bitmap_find_free_region (unsigned long * bitmap, unsigned int bits, int order)¶\r\nfind a contiguous aligned mem region\r\nParameters\r\nunsigned long * bitmap\r\narray of unsigned longs corresponding to the bitmap\r\nunsigned int bits\r\nnumber of bits in the bitmap\r\nint order\r\nregion size (log base 2 of number of bits) to find\r\nDescription\r\nFind a region of free (zero) bits in a bitmap of bits bits and allocate them (set them to one). Only consider regions\r\nof length a power (order) of two, aligned to that power of two, which makes the search algorithm much faster.\r\nReturn the bit offset in bitmap of the allocated region, or -errno on failure.\r\nvoid bitmap_release_region (unsigned long * bitmap, unsigned int pos, int order)¶\r\nrelease allocated bitmap region\r\nParameters\r\nunsigned long * bitmap\r\narray of unsigned longs corresponding to the bitmap\r\nunsigned int pos\r\nbeginning of bit region to release\r\nint order\r\nregion size (log base 2 of number of bits) to release\r\nDescription\r\nThis is the complement to __bitmap_find_free_region() and releases the found region (by clearing it in the\r\nbitmap).\r\nNo return value.\r\nint bitmap_allocate_region (unsigned long * bitmap, unsigned int pos, int order)¶\r\nallocate bitmap region\r\nParameters\r\nunsigned long * bitmap\r\narray of unsigned longs corresponding to the bitmap\r\nunsigned int pos\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 47 of 228\n\nbeginning of bit region to allocate\r\nint order\r\nregion size (log base 2 of 
number of bits) to allocate
Description
Allocate (set bits in) a specified region of a bitmap.
Return 0 on success, or -EBUSY if the specified region wasn’t free (not all bits were zero).
unsigned int bitmap_from_u32array (unsigned long * bitmap, unsigned int nbits, const u32 * buf, unsigned int nwords)¶
copy the contents of a u32 array of bits to bitmap
Parameters
unsigned long * bitmap
array of unsigned longs, the destination bitmap, non NULL
unsigned int nbits
number of bits in bitmap
const u32 * buf
array of u32 (in host byte order), the source bitmap, non NULL
unsigned int nwords
number of u32 words in buf
Description
Copy min(nbits, 32*nwords) bits from buf to bitmap; remaining bits between nwords * 32 and nbits in bitmap (if any) are cleared. In the last word of bitmap, the bits beyond nbits (if any) are kept unchanged.
Return the number of bits effectively copied.
unsigned int bitmap_to_u32array (u32 * buf, unsigned int nwords, const unsigned long * bitmap, unsigned int nbits)¶
copy the contents of bitmap to a u32 array of bits
Parameters
u32 * buf
array of u32 (in host byte order), the dest bitmap, non NULL
unsigned int nwords
number of u32 words in buf
const unsigned long * bitmap
array of unsigned longs, the source bitmap, non NULL
unsigned int nbits
number of bits in bitmap
Description
Copy min(nbits, 32*nwords) bits from bitmap to buf. 
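The copy rule can be illustrated with a userspace model (a hypothetical model_to_u32array() helper for illustration only, not the kernel implementation; u32 is spelled uint32_t here):

```c
#include <stdint.h>

/* Userspace model of the bitmap_to_u32array() copy rule: copy
 * min(nbits, 32 * nwords) bits from an unsigned-long bitmap into a
 * u32 array, clearing all remaining bits of buf. Returns the number
 * of bits effectively copied. */
unsigned int model_to_u32array(uint32_t *buf, unsigned int nwords,
                               const unsigned long *bitmap,
                               unsigned int nbits)
{
    unsigned int dst_bits = 32 * nwords;
    unsigned int n = nbits < dst_bits ? nbits : dst_bits;
    unsigned int i;

    for (i = 0; i < dst_bits; i++) {
        int set = 0;

        if (i < n)    /* only bits below min(nbits, 32*nwords) copy over */
            set = (bitmap[i / (8 * sizeof(long))] >>
                   (i % (8 * sizeof(long)))) & 1;
        if (set)
            buf[i / 32] |= 1u << (i % 32);
        else
            buf[i / 32] &= ~(1u << (i % 32));
    }
    return n;
}
```

Calling this with nwords = 1 and nbits = 20 copies only the low 20 bits and clears bits 20..31 of buf[0], mirroring the clearing behaviour described here.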
Remaining bits after nbits in buf (if any) are cleared.
Return the number of bits effectively copied.
void bitmap_copy_le (unsigned long * dst, const unsigned long * src, unsigned int nbits)¶
copy a bitmap, putting the bits into little-endian order.
Parameters
unsigned long * dst
destination buffer
const unsigned long * src
bitmap to copy
unsigned int nbits
number of bits in the bitmap
Description
Requires nbits % BITS_PER_LONG == 0.
int __bitmap_parselist (const char * buf, unsigned int buflen, int is_user, unsigned long * maskp, int nmaskbits)¶
convert list format ASCII string to bitmap
Parameters
const char * buf
read nul-terminated user string from this buffer
unsigned int buflen
buffer size in bytes. If the string is smaller than this then it must be terminated with a 0.
int is_user
location of buffer, 0 indicates kernel space
unsigned long * maskp
write resulting mask here
int nmaskbits
number of bits in mask to be written
Description
Input format is a comma-separated list of decimal numbers and ranges. Consecutively set bits are shown as two hyphen-separated decimal numbers, the smallest and largest bit numbers set in the range. Optionally, each range can be postfixed to denote that only parts of it should be set. The range will be divided into groups of a specific size, and from each group only the defined number of bits will be used. Syntax: range:used_size/group_size
Example
0-1023:2/256 ==> 0,1,256,257,512,513,768,769
Return
0 on success, -errno on invalid input strings. 
Error values:
-EINVAL : second number in range smaller than first
-EINVAL : invalid character in string
-ERANGE : bit number specified too large for mask
int bitmap_pos_to_ord (const unsigned long * buf, unsigned int pos, unsigned int nbits)¶
find ordinal of set bit at given position in bitmap
Parameters
const unsigned long * buf
pointer to a bitmap
unsigned int pos
a bit position in buf (0 <= pos < nbits)
unsigned int nbits
number of valid bit positions in buf
Description
Map the bit at position pos in buf (of length nbits) to the ordinal of which set bit it is. If it is not set, or if pos is not a valid bit position, map to -1.
If, for example, just bits 4 through 7 are set in buf, then pos values 4 through 7 will get mapped to 0 through 3, respectively, and other pos values will get mapped to -1. When pos value 7 gets mapped to (returns) ord value 3 in this example, that means that bit 7 is the 3rd (starting with 0th) set bit in buf.
The bit positions 0 through nbits-1 are valid positions in buf.
unsigned int bitmap_ord_to_pos (const unsigned long * buf, unsigned int ord, unsigned int nbits)¶
find position of n-th set bit in bitmap
Parameters
const unsigned long * buf
pointer to bitmap
unsigned int ord
ordinal bit position (n-th set bit, n >= 0)
unsigned int nbits
number of valid bit positions in buf
Description
Map the ordinal offset of bit ord in buf to its position in buf. The value of ord should be in the range 0 <= ord < weight(buf); if ord >= weight(buf), this returns nbits.
If, for example, just bits 4 through 7 are set in buf, then ord values 0 through 3 will get mapped to 4 through 7, respectively, and all other ord values return nbits. 
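This pos/ord mapping can be modelled in plain C over a single 64-bit word (hypothetical model_* helpers for illustration; the kernel versions operate on arrays of unsigned long):

```c
#include <stdint.h>

/* Model of bitmap_pos_to_ord(): which set bit (counting from 0) is
 * the bit at position pos? Returns -1 if pos is invalid or unset. */
int model_pos_to_ord(uint64_t w, unsigned int pos, unsigned int nbits)
{
    unsigned int i, ord = 0;

    if (pos >= nbits || !((w >> pos) & 1))
        return -1;
    for (i = 0; i < pos; i++)
        ord += (w >> i) & 1;    /* count set bits below pos */
    return ord;
}

/* Model of bitmap_ord_to_pos(): position of the ord-th set bit,
 * or nbits if ord >= weight(w). */
unsigned int model_ord_to_pos(uint64_t w, unsigned int ord, unsigned int nbits)
{
    unsigned int pos;

    for (pos = 0; pos < nbits; pos++)
        if (((w >> pos) & 1) && ord-- == 0)
            return pos;
    return nbits;
}
```

With w = 0xF0 (bits 4 through 7 set), model_pos_to_ord(w, 7, 64) yields 3 and model_ord_to_pos(w, 3, 64) yields 7, matching the worked example in the text.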
When ord value 3 gets mapped to (returns) pos value 7 in this example, that means that the 3rd set bit (starting with 0th) is at position 7 in buf.
The bit positions 0 through nbits-1 are valid positions in buf.
Command-line Parsing¶
int get_option (char ** str, int * pint)¶
Parse integer from an option string
Parameters
char ** str
option string
int * pint
(output) integer value parsed from str
Description
Read an int from an option string; if available, accept a subsequent comma as well.
Return values:
0 - no int in string
1 - int found, no subsequent comma
2 - int found including a subsequent comma
3 - hyphen found to denote a range
char * get_options (const char * str, int nints, int * ints)¶
Parse a string into a list of integers
Parameters
const char * str
String to be parsed
int nints
size of integer array
int * ints
integer array
Description
This function parses a string containing a comma-separated list of integers, a hyphen-separated range of _positive_ integers, or a combination of both. The parse halts when the array is full, or when no more numbers can be retrieved from the string.
The return value is the character in the string which caused the parse to end (typically a null terminator, if str is completely parseable).
unsigned long long memparse (const char * ptr, char ** retptr)¶
parse a string with mem suffixes into a number
Parameters
const char * ptr
Where parse begins
char ** retptr
(output) Optional pointer to next char after parse completes
Description
Parses a string into a number. 
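The suffix handling (K, M, G, T, P, E, each a further factor of 1024) can be sketched in userspace C — a hypothetical reimplementation for illustration, not the kernel code:

```c
#include <stdlib.h>

/* Userspace sketch of memparse()-style parsing: a number optionally
 * followed by one of K/M/G/T/P/E (upper or lower case). The cases
 * deliberately fall through, so e.g. 'G' accumulates three <<10 shifts. */
unsigned long long model_memparse(const char *ptr, char **retptr)
{
    char *end;
    unsigned long long ret = strtoull(ptr, &end, 0);

    switch (*end) {
    case 'E': case 'e': ret <<= 10; /* fall through */
    case 'P': case 'p': ret <<= 10; /* fall through */
    case 'T': case 't': ret <<= 10; /* fall through */
    case 'G': case 'g': ret <<= 10; /* fall through */
    case 'M': case 'm': ret <<= 10; /* fall through */
    case 'K': case 'k': ret <<= 10;
        end++;                      /* consume the suffix character */
    default:
        break;
    }
    if (retptr)
        *retptr = end;
    return ret;
}
```

So "64K" parses to 65536, and "2M" to 2097152; the returned end pointer sits just past the suffix.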
The number stored at ptr is potentially suffixed with K, M, G, T, P, E.
CRC Functions¶
u8 crc7_be (u8 crc, const u8 * buffer, size_t len)¶
update the CRC7 for the data buffer
Parameters
u8 crc
previous CRC7 value
const u8 * buffer
data pointer
size_t len
number of bytes in the buffer
Context
any
Description
Returns the updated CRC7 value. The CRC7 is left-aligned in the byte (the lsbit is always 0), as that makes the computation easier, and all callers want it in that form.
u16 crc16 (u16 crc, u8 const * buffer, size_t len)¶
compute the CRC-16 for the data buffer
Parameters
u16 crc
previous CRC value
u8 const * buffer
data pointer
size_t len
number of bytes in the buffer
Description
Returns the updated CRC value.
u16 crc_itu_t (u16 crc, const u8 * buffer, size_t len)¶
Compute the CRC-ITU-T for the data buffer
Parameters
u16 crc
previous CRC value
const u8 * buffer
data pointer
size_t len
number of bytes in the buffer
Description
Returns the updated CRC value.
u32 __pure crc32_le_generic (u32 crc, unsigned char const * p, size_t len, const u32 (*tab)[256], u32 polynomial)¶
Calculate bitwise little-endian Ethernet AUTODIN II CRC32/CRC32C
Parameters
u32 crc
seed value for computation. 
~0 for Ethernet, sometimes 0 for other uses, or the previous crc32/crc32c value if computing incrementally.
unsigned char const * p
pointer to buffer over which CRC32/CRC32C is run
size_t len
length of buffer p
const u32 (*tab)[256]
little-endian Ethernet table
u32 polynomial
CRC32/CRC32c LE polynomial
u32 __attribute_const__ crc32_generic_shift (u32 crc, size_t len, u32 polynomial)¶
Append len 0 bytes to crc, in logarithmic time
Parameters
u32 crc
The original little-endian CRC (i.e. lsbit is x^31 coefficient)
size_t len
The number of bytes. crc is multiplied by x^(8*len)
u32 polynomial
The modulus used to reduce the result to 32 bits.
Description
It’s possible to parallelize CRC computations by computing a CRC over separate ranges of a buffer, then summing them. This shifts the given CRC by 8*len bits (i.e. produces the same effect as appending len bytes of zero to the data), in time proportional to log(len).
u32 __pure crc32_be_generic (u32 crc, unsigned char const * p, size_t len, const u32 (*tab)[256], u32 polynomial)¶
Calculate bitwise big-endian Ethernet AUTODIN II CRC32
Parameters
u32 crc
seed value for computation. 
~0 for Ethernet, sometimes 0 for other uses, or the previous crc32 value if computing incrementally.
unsigned char const * p
pointer to buffer over which CRC32 is run
size_t len
length of buffer p
const u32 (*tab)[256]
big-endian Ethernet table
u32 polynomial
CRC32 BE polynomial
u16 crc_ccitt (u16 crc, u8 const * buffer, size_t len)¶
recompute the CRC for the data buffer
Parameters
u16 crc
previous CRC value
u8 const * buffer
data pointer
size_t len
number of bytes in the buffer
idr/ida Functions¶
idr synchronization (stolen from radix-tree.h)
idr_find() is able to be called locklessly, using RCU. The caller must ensure calls to this function are made within rcu_read_lock() regions. Other readers (lock-free or otherwise) and modifications may be running concurrently.
It is still required that the caller manage the synchronization and lifetimes of the items. So if RCU lock-free lookups are used, typically this would mean that the items have their own locks, or are amenable to lock-free access; and that the items are freed by RCU (or only freed after having been deleted from the idr tree and a synchronize_rcu() grace period).
The IDA is an ID allocator which does not provide the ability to associate an ID with a pointer. As such, it only needs to store one bit per ID, and so is more space efficient than an IDR. To use an IDA, define it using DEFINE_IDA() (or embed a struct ida in a data structure, then initialise it using ida_init() ). To allocate a new ID, call ida_simple_get() . To free an ID, call ida_simple_remove() .
If you have more complex locking requirements, use a loop around ida_pre_get() and ida_get_new() to allocate a new ID. Then use ida_remove() to free an ID. 
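The allocate-lowest-free-ID semantics can be illustrated with a toy userspace model storing one bit per ID, as the text describes (hypothetical model_* names and a flat byte array for illustration only; the real IDA is tree-based and handles memory allocation and locking):

```c
#define MODEL_IDA_MAX 128

/* Toy userspace model of IDA semantics: one bit per ID. */
struct model_ida {
    unsigned char used[MODEL_IDA_MAX / 8];
};

/* Allocate the smallest free ID >= start; 0 on success, with the ID
 * stored through *id. Returns -1 (standing in for -ENOSPC) when full. */
int model_ida_get_new_above(struct model_ida *ida, int start, int *id)
{
    int i;

    for (i = start; i < MODEL_IDA_MAX; i++) {
        if (!(ida->used[i / 8] & (1 << (i % 8)))) {
            ida->used[i / 8] |= 1 << (i % 8);
            *id = i;
            return 0;
        }
    }
    return -1;
}

/* Free a previously allocated ID so it can be handed out again. */
void model_ida_remove(struct model_ida *ida, int id)
{
    ida->used[id / 8] &= ~(1 << (id % 8));
}
```

Two successive allocations from an empty model return 0 and 1; removing 0 makes the next allocation return 0 again, which is the reuse behaviour an IDA provides.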
You must make sure that ida_get_new() and ida_remove() cannot be called at the same time as each other for the same IDA.
You can also use ida_get_new_above() if you need an ID to be allocated above a particular number. ida_destroy() can be used to dispose of an IDA without needing to free the individual IDs in it. You can use ida_is_empty() to find out whether the IDA has any IDs currently allocated.
IDs are currently limited to the range [0-INT_MAX]. If this is an awkward limitation, it should be quite straightforward to raise the maximum.
int idr_alloc (struct idr * idr, void * ptr, int start, int end, gfp_t gfp)¶
allocate an id
Parameters
struct idr * idr
idr handle
void * ptr
pointer to be associated with the new id
int start
the minimum id (inclusive)
int end
the maximum id (exclusive)
gfp_t gfp
memory allocation flags
Description
Allocates an unused ID in the range [start, end). Returns -ENOSPC if there are no unused IDs in that range.
Note that end is treated as max when <= 0. This is to always allow using start + N as end as long as N is inside integer range.
Simultaneous modifications to the idr are not allowed and should be prevented by the user, usually with a lock. idr_alloc() may be called concurrently with read-only accesses to the idr, such as idr_find() and idr_for_each_entry() .
int idr_alloc_cyclic (struct idr * idr, void * ptr, int start, int end, gfp_t gfp)¶
allocate new idr entry in a cyclical fashion
Parameters
struct idr * idr
idr handle
void * ptr
pointer to be associated with the new id
int start
the minimum id (inclusive)
int end
the maximum id (exclusive)
gfp_t gfp
memory allocation flags
Description
Allocates an ID larger than the last ID allocated if one is available. 
If not, it will attempt to allocate the smallest ID that is larger than or equal to start.
int idr_for_each (const struct idr * idr, int (*fn)(int id, void *p, void *data), void * data)¶
iterate through all stored pointers
Parameters
const struct idr * idr
idr handle
int (*)(int id, void *p, void *data) fn
function to be called for each pointer
void * data
data passed to callback function
Description
The callback function will be called for each entry in idr, passing the id, the pointer and the data pointer passed to this function.
If fn returns anything other than 0 , the iteration stops and that value is returned from this function.
idr_for_each() can be called concurrently with idr_alloc() and idr_remove() if protected by RCU. Newly added entries may not be seen and deleted entries may be seen, but adding and removing entries will not cause other entries to be skipped, nor spurious ones to be seen.
void * idr_get_next (struct idr * idr, int * nextid)¶
Find next populated entry
Parameters
struct idr * idr
idr handle
int * nextid
Pointer to lowest possible ID to return
Description
Returns the next populated entry in the tree with an ID greater than or equal to the value pointed to by nextid. On exit, nextid is updated to the ID of the found value. To use in a loop, the value pointed to by nextid must be incremented by the user.
void * idr_replace (struct idr * idr, void * ptr, int id)¶
replace pointer for given id
Parameters
struct idr * idr
idr handle
void * ptr
New pointer to associate with the ID
int id
Lookup key
Description
Replace the pointer registered with an ID and return the old value. 
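The replace semantics can be modelled in userspace with a flat array standing in for the kernel's radix tree (hypothetical model_* helpers; a plain sentinel pointer stands in for the ERR_PTR(-ENOENT) value the kernel returns):

```c
#define MODEL_IDR_MAX 64

/* Toy userspace model of an IDR: slot[id] holds the pointer for id. */
struct model_idr {
    void *slot[MODEL_IDR_MAX];
};

/* Sentinel used where the kernel would return ERR_PTR(-ENOENT). */
#define MODEL_ENOENT ((void *)-1)

/* Model of idr_replace(): swap in ptr for an *existing* id and hand
 * back the old pointer; ids with no registered pointer are an error. */
void *model_idr_replace(struct model_idr *idr, void *ptr, int id)
{
    void *old;

    if (id < 0 || id >= MODEL_IDR_MAX || !idr->slot[id])
        return MODEL_ENOENT;
    old = idr->slot[id];
    idr->slot[id] = ptr;
    return old;
}
```

The key point the model captures: replace never allocates; an id that was never handed out stays an error, while a populated id atomically swaps to the new pointer and yields the old one.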
This function can be called under the RCU read lock concurrently with idr_alloc() and idr_remove() (as long as the ID being removed is not the one being replaced!).
Return
0 on success. -ENOENT indicates that id was not found. -EINVAL indicates that id or ptr were not valid.
int ida_get_new_above (struct ida * ida, int start, int * id)¶
allocate new ID above or equal to a start id
Parameters
struct ida * ida
ida handle
int start
id to start search at
int * id
pointer to the allocated handle
Description
Allocate new ID above or equal to start. It should be called with any required locks to ensure that concurrent calls to ida_get_new_above() / ida_get_new() / ida_remove() are not allowed. Consider using ida_simple_get() if you do not have complex locking requirements.
If memory is required, it will return -EAGAIN ; you should unlock and go back to the ida_pre_get() call. If the ida is full, it will return -ENOSPC . On success, it will return 0.
id returns a value in the range start ... 0x7fffffff .
void ida_remove (struct ida * ida, int id)¶
Free the given ID
Parameters
struct ida * ida
ida handle
int id
ID to free
Description
This function should not be called at the same time as ida_get_new_above() .
void ida_destroy (struct ida * ida)¶
Free the contents of an ida
Parameters
struct ida * ida
ida handle
Description
Calling this function releases all resources associated with an IDA. When this call returns, the IDA is empty and can be reused or freed. 
The caller should not allow ida_remove() or ida_get_new_above() to be called at the same time.
int ida_simple_get (struct ida * ida, unsigned int start, unsigned int end, gfp_t gfp_mask)¶
get a new id.
Parameters
struct ida * ida
the (initialized) ida.
unsigned int start
the minimum id (inclusive, < 0x80000000)
unsigned int end
the maximum id (exclusive, < 0x80000000 or 0)
gfp_t gfp_mask
memory allocation flags
Description
Allocates an id in the range start <= id < end, or returns -ENOSPC. On memory allocation failure, returns -ENOMEM.
Compared to ida_get_new_above() this function does its own locking, and should be used unless there are special requirements.
Use ida_simple_remove() to get rid of an id.
void ida_simple_remove (struct ida * ida, unsigned int id)¶
remove an allocated id.
Parameters
struct ida * ida
the (initialized) ida.
unsigned int id
the id returned by ida_simple_get.
Description
Use to release an id allocated with ida_simple_get() .
Compared to ida_remove() this function does its own locking, and should be used unless there are special requirements.
Memory Management in Linux¶
The Slab Cache¶
void * kmalloc (size_t size, gfp_t flags)¶
allocate memory
Parameters
size_t size
how many bytes of memory are required.
gfp_t flags
the type of memory to allocate.
Description
kmalloc is the normal method of allocating memory for objects smaller than page size in the kernel.
The flags argument may be one of:
GFP_USER - Allocate memory on behalf of user. May sleep.
GFP_KERNEL - Allocate normal kernel ram. May sleep.
GFP_ATOMIC - Allocation will not sleep. 
May use emergency pools. For example, use this inside interrupt handlers.
GFP_HIGHUSER - Allocate pages from high memory.
GFP_NOIO - Do not do any I/O at all while trying to get memory.
GFP_NOFS - Do not make any fs calls while trying to get memory.
GFP_NOWAIT - Allocation will not sleep.
__GFP_THISNODE - Allocate node-local memory only.
GFP_DMA - Allocation suitable for DMA. Should only be used for kmalloc() caches. Otherwise, use a slab created with SLAB_DMA.
Also it is possible to set different flags by OR’ing in one or more of the following additional flags:
__GFP_COLD - Request cache-cold pages instead of trying to return cache-warm pages.
__GFP_HIGH - This allocation has high priority and may use emergency pools.
__GFP_NOFAIL - Indicate that this allocation is in no way allowed to fail (think twice before using).
__GFP_NORETRY - If memory is not immediately available, then give up at once.
__GFP_NOWARN - If allocation fails, don’t issue any warnings.
__GFP_REPEAT - If allocation fails initially, try once more before failing.
There are other flags available as well, but these are not intended for general use, and so are not documented here. For a full list of potential flags, always refer to linux/gfp.h.
void * kmalloc_array (size_t n, size_t size, gfp_t flags)¶
allocate memory for an array.
Parameters
size_t n
number of elements.
size_t size
element size.
gfp_t flags
the type of memory to allocate (see kmalloc).
void * kcalloc (size_t n, size_t size, gfp_t flags)¶
allocate memory for an array. The memory is set to zero.
Parameters
size_t n
number of elements.
size_t size
element size.
gfp_t flags
the type of memory to allocate (see kmalloc).
void * kzalloc (size_t size, gfp_t flags)¶
allocate memory. 
The memory is set to zero.
Parameters
size_t size
how many bytes of memory are required.
gfp_t flags
the type of memory to allocate (see kmalloc).
void * kzalloc_node (size_t size, gfp_t flags, int node)¶
allocate zeroed memory from a particular memory node.
Parameters
size_t size
how many bytes of memory are required.
gfp_t flags
the type of memory to allocate (see kmalloc).
int node
memory node from which to allocate
void * kmem_cache_alloc (struct kmem_cache * cachep, gfp_t flags)¶
Allocate an object
Parameters
struct kmem_cache * cachep
The cache to allocate from.
gfp_t flags
See kmalloc() .
Description
Allocate an object from this cache. The flags are only relevant if the cache has no available objects.
void * kmem_cache_alloc_node (struct kmem_cache * cachep, gfp_t flags, int nodeid)¶
Allocate an object on the specified node
Parameters
struct kmem_cache * cachep
The cache to allocate from.
gfp_t flags
See kmalloc() .
int nodeid
node number of the target node.
Description
Identical to kmem_cache_alloc but it will allocate memory on the given node, which can improve the performance for cpu bound structures.
Fallback to other node is possible if __GFP_THISNODE is not set.
void kmem_cache_free (struct kmem_cache * cachep, void * objp)¶
Deallocate an object
Parameters
struct kmem_cache * cachep
The cache the allocation was from.
void * objp
The previously allocated object.
Description
Free an object which was previously allocated from this cache.
void kfree (const void * objp)¶
free previously allocated memory
Parameters
const void * objp
pointer returned by kmalloc.
Description
If objp is NULL, no operation is performed.
Don’t free memory not 
originally allocated by kmalloc() or you will run into trouble.
size_t ksize (const void * objp)¶
get the actual amount of memory allocated for a given object
Parameters
const void * objp
Pointer to the object
Description
kmalloc may internally round up allocations and return more memory than requested. ksize() can be used to determine the actual amount of memory allocated. The caller may use this additional memory, even though a smaller amount of memory was initially specified with the kmalloc call. The caller must guarantee that objp points to a valid object previously allocated with either kmalloc() or kmem_cache_alloc() . The object must not be freed during the duration of the call.
void kfree_const (const void * x)¶
conditionally free memory
Parameters
const void * x
pointer to the memory
Description
Function calls kfree only if x is not in the .rodata section.
char * kstrdup (const char * s, gfp_t gfp)¶
allocate space for and copy an existing string
Parameters
const char * s
the string to duplicate
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
const char * kstrdup_const (const char * s, gfp_t gfp)¶
conditionally duplicate an existing const string
Parameters
const char * s
the string to duplicate
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
Description
Function returns the source string if it is in the .rodata section; otherwise it falls back to kstrdup(). 
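The kstrdup() pattern itself maps directly onto userspace C (malloc in place of kmalloc, and no GFP flags in userspace; model_strdup is a hypothetical name for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* Userspace equivalent of the kstrdup() pattern: measure the string,
 * allocate len+1 bytes, copy including the NUL terminator. Returns
 * NULL for a NULL input or on allocation failure, as kstrdup does. */
char *model_strdup(const char *s)
{
    size_t len;
    char *buf;

    if (!s)
        return NULL;
    len = strlen(s) + 1;
    buf = malloc(len);
    if (buf)
        memcpy(buf, s, len);
    return buf;
}
```

As with kstrdup()/kfree(), the caller owns the returned buffer and must free() it when done.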
Strings allocated by kstrdup_const should be freed by kfree_const.
char * kstrndup (const char * s, size_t max, gfp_t gfp)¶
allocate space for and copy an existing string
Parameters
const char * s
the string to duplicate
size_t max
read at most max chars from s
gfp_t gfp
the GFP mask used in the kmalloc() call when allocating memory
void * kmemdup (const void * src, size_t len, gfp_t gfp)¶
duplicate region of memory
Parameters
const void * src
memory region to duplicate
size_t len
memory region length
gfp_t gfp
GFP mask to use
void * memdup_user (const void __user * src, size_t len)¶
duplicate memory region from user space
Parameters
const void __user * src
source address in user space
size_t len
number of bytes to copy
Description
Returns an ERR_PTR() on failure.
void * memdup_user_nul (const void __user * src, size_t len)¶
duplicate memory region from user space and NUL-terminate
Parameters
const void __user * src
source address in user space
size_t len
number of bytes to copy
Description
Returns an ERR_PTR() on failure.
int get_user_pages_fast (unsigned long start, int nr_pages, int write, struct page ** pages)¶
pin user pages in memory
Parameters
unsigned long start
starting user address
int nr_pages
number of pages from start to pin
int write
whether pages will be written to
struct page ** pages
array that receives pointers to the pages pinned. Should be at least nr_pages long.
Description
Returns number of pages pinned. This may be fewer than the number requested. If nr_pages is 0 or negative, returns 0. 
If no pages were pinned, returns -errno.
get_user_pages_fast provides equivalent functionality to get_user_pages, operating on current and current->mm, with force=0 and vma=NULL. However, unlike get_user_pages, it must be called without mmap_sem held.
get_user_pages_fast may take mmap_sem and page table locks, so no assumptions can be made about lack of locking. get_user_pages_fast is to be implemented in a way that is advantageous (vs get_user_pages() ) when the user memory area is already faulted in and present in ptes. However, if the pages have to be faulted in, it may turn out to be slightly slower, so callers need to carefully consider what to use. On many architectures, get_user_pages_fast simply falls back to get_user_pages.
void * kvmalloc_node (size_t size, gfp_t flags, int node)¶
attempt to allocate physically contiguous memory, but upon failure, fall back to non-contiguous (vmalloc) allocation.
Parameters
size_t size
size of the request.
gfp_t flags
gfp mask for the allocation - must be compatible (superset) with GFP_KERNEL.
int node
numa node to allocate from
Description
Uses kmalloc to get the memory but if the allocation fails then falls back to the vmalloc allocator. Use kvfree for freeing the memory.
Reclaim modifiers - __GFP_NORETRY and __GFP_NOFAIL are not supported. 
__GFP_REPEAT is supported only for large (>32kB) allocations, and it should be used only if kmalloc is preferable to the vmalloc fallback, due to visible performance drawbacks.
Any use of gfp flags outside of GFP_KERNEL should be discussed with the mm people.
User Space Memory Access¶
unsigned long clear_user (void __user * to, unsigned long n)¶
Zero a block of memory in user space.
Parameters
void __user * to
Destination address, in user space.
unsigned long n
Number of bytes to zero.
Description
Zero a block of memory in user space.
Returns number of bytes that could not be cleared. On success, this will be zero.
unsigned long __clear_user (void __user * to, unsigned long n)¶
Zero a block of memory in user space, with less checking.
Parameters
void __user * to
Destination address, in user space.
unsigned long n
Number of bytes to zero.
Description
Zero a block of memory in user space. Caller must check the specified block with access_ok() before calling this function.
Returns number of bytes that could not be cleared. On success, this will be zero.
More Memory Management Functions¶
int read_cache_pages (struct address_space * mapping, struct list_head * pages, int (*filler)(void *, struct page *), void * data)¶
populate an address space with some pages & start reads against them
Parameters
struct address_space * mapping
the address_space
struct list_head * pages
The address of a list_head which contains the target pages. 
These pages have their ->index populated and are otherwise uninitialised.
int (*)(void *, struct page *) filler
callback routine for filling a single page.
void * data
private data for the callback routine.
Description
Hides the details of the LRU cache etc. from the filesystems.
void page_cache_sync_readahead (struct address_space * mapping, struct file_ra_state * ra, struct file * filp, pgoff_t offset, unsigned long req_size)¶
generic file readahead
Parameters
struct address_space * mapping
address_space which holds the pagecache and I/O vectors
struct file_ra_state * ra
file_ra_state which holds the readahead state
struct file * filp
passed on to ->readpage() and ->readpages()
pgoff_t offset
start offset into mapping, in pagecache page-sized units
unsigned long req_size
hint: total size of the read which the caller is performing in pagecache pages
Description
page_cache_sync_readahead() should be called when a cache miss happened: it will submit the read. 
The readahead logic may decide to piggyback more pages onto the read request if access patterns suggest it will improve performance.
void page_cache_async_readahead (struct address_space * mapping, struct file_ra_state * ra, struct file * filp, struct page * page, pgoff_t offset, unsigned long req_size)¶
file readahead for marked pages
Parameters
struct address_space * mapping
address_space which holds the pagecache and I/O vectors
struct file_ra_state * ra
file_ra_state which holds the readahead state
struct file * filp
passed on to ->readpage() and ->readpages()
struct page * page
the page at offset which has the PG_readahead flag set
pgoff_t offset
start offset into mapping, in pagecache page-sized units
unsigned long req_size
hint: total size of the read which the caller is performing in pagecache pages
Description
page_cache_async_readahead() should be called when a page is used which has the PG_readahead flag; this is a marker to suggest that the application has used up enough of the readahead window that we should start pulling in more pages.
void delete_from_page_cache (struct page * page)¶
delete page from page cache
Parameters
struct page * page
the page which the kernel is trying to remove from page cache
Description
This must be called only on pages that have been verified to be in the page cache and locked. It will never put the page into the free list; the caller has a reference on the page.
int filemap_flush (struct address_space * mapping)¶
mostly a non-blocking flush
Parameters
struct address_space * mapping
target address_space
Description
This is a mostly non-blocking flush. 
Not suitable for data-integrity purposes - I/O may not be started against all dirty pages.
int filemap_fdatawait_range (struct address_space * mapping, loff_t start_byte, loff_t end_byte)¶
wait for writeback to complete
Parameters
struct address_space * mapping
address space structure to wait for
loff_t start_byte
offset in bytes where the range starts
loff_t end_byte
offset in bytes where the range ends (inclusive)
Description
Walk the list of under-writeback pages of the given address space in the given range and wait for all of them. Check error status of the address space and return it.
Since the error status of the address space is cleared by this function, callers are responsible for checking the return value and handling and/or reporting the error.
int filemap_fdatawait (struct address_space * mapping)¶
wait for all under-writeback pages to complete
Parameters
struct address_space * mapping
address space structure to wait for
Description
Walk the list of under-writeback pages of the given address space and wait for all of them.
Check error status of the address space and return it.
Since the error status of the address space is cleared by this function, callers are responsible for checking the return value and handling and/or reporting the error.
int filemap_write_and_wait_range (struct address_space * mapping, loff_t lstart, loff_t lend)¶
write out & wait on a file range
Parameters
struct address_space * mapping
the address_space for the pages
loff_t lstart
offset in bytes where the range starts
loff_t lend
offset in bytes where the range ends (inclusive)
Description
Write out and wait upon file offsets lstart->lend, inclusive.
Note that lend is inclusive (describes the last byte to be written) so that this function can be used to write to the very end-of-file (end = -1).
int replace_page_cache_page (struct page * old, struct page * new, gfp_t gfp_mask)¶
replace a pagecache page with a new one
Parameters
struct page * old
page to be replaced
struct page * new
page to replace with
gfp_t gfp_mask
allocation mode
Description
This function replaces a page in the pagecache with a new one. On success it acquires the pagecache reference for the new page and drops it for the old page. Both the old and new pages must be locked. This function does not add the new page to the LRU; the caller must do that.
The remove + add is atomic. The only way this function can fail is memory allocation failure.
int add_to_page_cache_locked (struct page * page, struct address_space * mapping, pgoff_t offset, gfp_t gfp_mask)¶
add a locked page to the pagecache
Parameters
struct page * page
page to add
struct address_space * mapping
the page's address_space
pgoff_t offset
page index
gfp_t gfp_mask
page allocation mode
Description
This function is used to add a page to the pagecache.
It must be locked. This function does not add the page to the LRU. The caller must do that.
void add_page_wait_queue (struct page * page, wait_queue_t * waiter)¶
Add an arbitrary waiter to a page's wait queue
Parameters
struct page * page
Page defining the wait queue of interest
wait_queue_t * waiter
Waiter to add to the queue
Description
Add an arbitrary waiter to the wait queue for the nominated page.
void unlock_page (struct page * page)¶
unlock a locked page
Parameters
struct page * page
the page
Description
Unlocks the page and wakes up sleepers in ___wait_on_page_locked(). Also wakes sleepers in wait_on_page_writeback() because the wakeup mechanism between PageLocked pages and PageWriteback pages is shared. But that's OK - sleepers in wait_on_page_writeback() just go back to sleep.
Note that this depends on PG_waiters being the sign bit in the byte that contains PG_locked - thus the BUILD_BUG_ON().
That allows us to clear the PG_locked bit and test PG_waiters at the same time fairly portably (architectures that do LL/SC can test any bit, while x86 can test the sign bit).
void end_page_writeback (struct page * page)¶
end writeback against a page
Parameters
struct page * page
the page
void __lock_page (struct page * __page)¶
get a lock on the page, assuming we need to sleep to get it
Parameters
struct page * __page
the page to lock
pgoff_t page_cache_next_hole (struct address_space * mapping, pgoff_t index, unsigned long max_scan)¶
find the next hole (not-present entry)
Parameters
struct address_space * mapping
mapping
pgoff_t index
index
unsigned long max_scan
maximum range to search
Description
Search the set [index, min(index+max_scan-1, MAX_INDEX)] for the lowest indexed hole.
Return
the index of the hole if found, otherwise returns an index outside of the set specified (in which case 'return - index >= max_scan' will be true). In rare cases of index wrap-around, 0 will be returned.
page_cache_next_hole may be called under rcu_read_lock. However, like radix_tree_gang_lookup, this will not atomically search a snapshot of the tree at a single point in time.
For example, if a hole is created at index 5, then subsequently a hole is created at index 10, page_cache_next_hole covering both indexes may return 10 if called under rcu_read_lock.
pgoff_t page_cache_prev_hole (struct address_space * mapping, pgoff_t index, unsigned long max_scan)¶
find the prev hole (not-present entry)
Parameters
struct address_space * mapping
mapping
pgoff_t index
index
unsigned long max_scan
maximum range to search
Description
Search backwards in the range [max(index-max_scan+1, 0), index] for the first hole.
Return
the index of the hole if found, otherwise returns an index outside of the set specified (in which case 'index - return >= max_scan' will be true). In rare cases of wrap-around, ULONG_MAX will be returned.
page_cache_prev_hole may be called under rcu_read_lock. However, like radix_tree_gang_lookup, this will not atomically search a snapshot of the tree at a single point in time. For example, if a hole is created at index 10, then subsequently a hole is created at index 5, page_cache_prev_hole covering both indexes may return 5 if called under rcu_read_lock.
struct page * find_get_entry (struct address_space * mapping, pgoff_t offset)¶
find and get a page cache entry
Parameters
struct address_space * mapping
the address_space to search
pgoff_t offset
the page cache index
Description
Looks up the page cache slot at mapping & offset.
If there is a page cache page, it is returned with an increased refcount.
If the slot holds a shadow entry of a previously evicted page, or a swap entry from shmem/tmpfs, it is returned.
Otherwise, NULL is returned.
struct page * find_lock_entry (struct address_space * mapping, pgoff_t offset)¶
locate, pin and lock a page cache entry
Parameters
struct address_space * mapping
the address_space to search
pgoff_t offset
the page cache index
Description
Looks up the page cache slot at mapping & offset. If there is a page cache page, it is returned locked and with an increased refcount.
If the slot holds a shadow entry of a previously evicted page, or a swap entry from shmem/tmpfs, it is returned.
Otherwise, NULL is returned.
find_lock_entry() may sleep.
struct page * pagecache_get_page (struct address_space * mapping, pgoff_t offset, int fgp_flags, gfp_t gfp_mask)¶
find and get a page reference
Parameters
struct address_space * mapping
the address_space to search
pgoff_t offset
the page index
int fgp_flags
PCG flags
gfp_t gfp_mask
gfp mask to use for the page cache data page allocation
Description
Looks up the page cache slot at mapping & offset.
PCG flags modify how the page is returned.
fgp_flags can be:
FGP_ACCESSED: the page will be marked accessed
FGP_LOCK: the page is returned locked
FGP_CREAT: If page is not present then a new page is allocated using gfp_mask and added to the page cache and the VM's LRU list. The page is returned locked and with an increased refcount.
Otherwise, NULL is returned.
If FGP_LOCK or FGP_CREAT are specified then the function may sleep even if the GFP flags specified for FGP_CREAT are atomic.
If there is a page cache page, it is returned with an increased refcount.
unsigned find_get_pages_contig (struct address_space * mapping, pgoff_t index, unsigned int nr_pages, struct page ** pages)¶
gang contiguous pagecache lookup
Parameters
struct address_space * mapping
The address_space to search
pgoff_t index
The starting page index
unsigned int nr_pages
The maximum number of pages
struct page ** pages
Where the resulting pages are placed
Description
find_get_pages_contig() works exactly like find_get_pages(), except that the returned pages are guaranteed to be contiguous.
find_get_pages_contig() returns the number of pages which were found.
unsigned find_get_pages_tag (struct address_space * mapping, pgoff_t * index, int tag, unsigned int nr_pages, struct page ** pages)¶
find and return pages that match tag
Parameters
struct address_space * mapping
the address_space to search
pgoff_t * index
the starting page index
int tag
the tag index
unsigned int nr_pages
the maximum number of pages
struct page ** pages
where the resulting pages are placed
Description
Like find_get_pages, except we only return pages which are tagged with tag.
We update index to index the next page for the traversal.
unsigned find_get_entries_tag (struct address_space * mapping, pgoff_t start, int tag, unsigned int nr_entries, struct page ** entries, pgoff_t * indices)¶
find and return entries that match tag
Parameters
struct address_space * mapping
the address_space to search
pgoff_t start
the starting page cache index
int tag
the tag index
unsigned int nr_entries
the maximum number of entries
struct page ** entries
where the resulting entries are placed
pgoff_t * indices
the cache indices corresponding to the entries in entries
Description
Like find_get_entries, except we only return entries which are tagged with tag.
ssize_t generic_file_read_iter (struct kiocb * iocb, struct iov_iter * iter)¶
generic filesystem read routine
Parameters
struct kiocb * iocb
kernel I/O control block
struct iov_iter * iter
destination for the data read
Description
This is the "read_iter()" routine for all filesystems that can use the page cache directly.
int filemap_fault (struct vm_fault * vmf)¶
read in file data for page fault handling
Parameters
struct vm_fault * vmf
struct vm_fault containing details of the fault
Description
filemap_fault() is invoked via the vma operations vector for a mapped memory region to read in file data during a page fault.
The goto's are kind of ugly, but this streamlines the normal case of having it in the page cache, and handles the special cases reasonably without having a lot of duplicated code.
vma->vm_mm->mmap_sem must be held on entry.
If our return value has VM_FAULT_RETRY set, it's because lock_page_or_retry() returned 0. The mmap_sem has usually been released in this case.
See __lock_page_or_retry() for the exception.
If our return value does not have VM_FAULT_RETRY set, the mmap_sem has not been released.
We never return with VM_FAULT_RETRY and a bit from VM_FAULT_ERROR set.
struct page * read_cache_page (struct address_space * mapping, pgoff_t index, int (*filler) (void *, struct page *), void * data)¶
read into page cache, fill it if needed
Parameters
struct address_space * mapping
the page's address_space
pgoff_t index
the page index
int (*)(void *, struct page *) filler
function to perform the read
void * data
first arg to filler(data, page) function, often left as NULL
Description
Read into the page cache. If a page already exists, and PageUptodate() is not set, try to fill the page and wait for it to become unlocked.
If the page does not get brought uptodate, return -EIO.
struct page * read_cache_page_gfp (struct address_space * mapping, pgoff_t index, gfp_t gfp)¶
read into page cache, using specified page allocation flags.
Parameters
struct address_space * mapping
the page's address_space
pgoff_t index
the page index
gfp_t gfp
the page allocator flags to use if allocating
Description
This is the same as "read_mapping_page(mapping, index, NULL)", but with any new page allocations done using the specified allocation flags.
If the page does not get brought uptodate, return -EIO.
ssize_t __generic_file_write_iter (struct kiocb * iocb, struct iov_iter * from)¶
write data to a file
Parameters
struct kiocb * iocb
IO state structure (file, offset, etc.)
struct iov_iter * from
iov_iter with data to write
Description
This function does all the work needed for actually writing data to a file.
It does all basic checks, removes SUID from the file, updates modification times and calls proper subroutines depending on whether we do direct IO or a standard buffered write.
It expects i_mutex to be grabbed unless we work on a block device or similar object which does not need locking at all.
This function does not take care of syncing data in case of O_SYNC write. A caller has to handle it. This is mainly due to the fact that we want to avoid syncing under i_mutex.
ssize_t generic_file_write_iter (struct kiocb * iocb, struct iov_iter * from)¶
write data to a file
Parameters
struct kiocb * iocb
IO state structure
struct iov_iter * from
iov_iter with data to write
Description
This is a wrapper around __generic_file_write_iter() to be used by most filesystems. It takes care of syncing the file in case of an O_SYNC file and acquires i_mutex as needed.
int try_to_release_page (struct page * page, gfp_t gfp_mask)¶
release old fs-specific metadata on a page
Parameters
struct page * page
the page which the kernel is trying to free
gfp_t gfp_mask
memory allocation flags (and I/O mode)
Description
The address_space is to try to release any data against the page (presumably at page->private). If the release was successful, return '1'.
Otherwise return zero.
This may also be called if PG_fscache is set on a page, indicating that the page is known to the local caching routines.
The gfp_mask argument specifies whether I/O may be performed to release this page (__GFP_IO), and whether the call may block (__GFP_RECLAIM & __GFP_FS).
int zap_vma_ptes (struct vm_area_struct * vma, unsigned long address, unsigned long size)¶
remove ptes mapping the vma
Parameters
struct vm_area_struct * vma
vm_area_struct holding ptes to be zapped
unsigned long address
starting address of pages to zap
unsigned long size
number of bytes to zap
Description
This function only unmaps ptes assigned to VM_PFNMAP vmas.
The entire address range must be fully contained within the vma.
Returns 0 if successful.
int vm_insert_page (struct vm_area_struct * vma, unsigned long addr, struct page * page)¶
insert single page into user vma
Parameters
struct vm_area_struct * vma
user vma to map to
unsigned long addr
target user address of this page
struct page * page
source kernel page
Description
This allows drivers to insert individual pages they've allocated into a user vma.
The page has to be a nice clean _individual_ kernel allocation. If you allocate a compound page, you need to have marked it as such (__GFP_COMP), or manually just split the page up yourself (see split_page()).
NOTE! Traditionally this was done with "remap_pfn_range()" which took an arbitrary page protection parameter. This doesn't allow that.
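A hypothetical driver mmap handler using vm_insert_page() might look like the sketch below. This is illustrative only and not compiled against any kernel tree; struct my_dev, its pages[]/npages fields, and my_dev_mmap() are invented names.

```c
/* Sketch (not compiled): map a driver-owned array of order-0 pages,
 * previously filled with alloc_page(), into the calling process. */
static int my_dev_mmap(struct file *file, struct vm_area_struct *vma)
{
	struct my_dev *dev = file->private_data;
	unsigned long uaddr = vma->vm_start;
	unsigned long i, npages = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
	int ret;

	if (npages > dev->npages)
		return -EINVAL;

	for (i = 0; i < npages; i++) {
		/* one order-0 page at a time, per the rules above */
		ret = vm_insert_page(vma, uaddr, dev->pages[i]);
		if (ret)
			return ret;
		uaddr += PAGE_SIZE;
	}
	return 0;
}
```

Because this runs from the f_op->mmap() path under mmap_sem, the handler may rely on the vma protections the caller requested rather than supplying its own pgprot.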
Your vma protection will have to be set up correctly, which means that if you want a shared writable mapping, you'd better ask for a shared writable mapping!
The page does not need to be reserved.
Usually this function is called from f_op->mmap() handler under mm->mmap_sem write-lock, so it can change vma->vm_flags. Caller must set VM_MIXEDMAP on vma if it wants to call this function from other places, for example from page-fault handler.
int vm_insert_pfn (struct vm_area_struct * vma, unsigned long addr, unsigned long pfn)¶
insert single pfn into user vma
Parameters
struct vm_area_struct * vma
user vma to map to
unsigned long addr
target user address of this page
unsigned long pfn
source kernel pfn
Description
Similar to vm_insert_page, this allows drivers to insert individual pages they've allocated into a user vma. Same comments apply.
This function should only be called from a vm_ops->fault handler, and in that case the handler should return NULL.
vma cannot be a COW mapping.
As this is called only for pages that do not currently exist, we do not need to flush old virtual caches or the TLB.
int vm_insert_pfn_prot (struct vm_area_struct * vma, unsigned long addr, unsigned long pfn, pgprot_t pgprot)¶
insert single pfn into user vma with specified pgprot
Parameters
struct vm_area_struct * vma
user vma to map to
unsigned long addr
target user address of this page
unsigned long pfn
source kernel pfn
pgprot_t pgprot
pgprot flags for the inserted page
Description
This is exactly like vm_insert_pfn, except that it allows drivers to override pgprot on a per-page basis.
This only makes sense for IO mappings, and it makes no sense for cow mappings.
In general, using multiple vmas is preferable; vm_insert_pfn_prot should only be used if using multiple VMAs is impractical.
int remap_pfn_range (struct vm_area_struct * vma, unsigned long addr, unsigned long pfn, unsigned long size, pgprot_t prot)¶
remap kernel memory to userspace
Parameters
struct vm_area_struct * vma
user vma to map to
unsigned long addr
target user address to start at
unsigned long pfn
physical address of kernel memory
unsigned long size
size of map area
pgprot_t prot
page protection flags for this mapping
Note
this is only safe if the mm semaphore is held when called.
int vm_iomap_memory (struct vm_area_struct * vma, phys_addr_t start, unsigned long len)¶
remap memory to userspace
Parameters
struct vm_area_struct * vma
user vma to map to
phys_addr_t start
start of area
unsigned long len
size of area
Description
This is a simplified io_remap_pfn_range() for common driver use. The driver just needs to give us the physical memory range to be mapped, we'll figure out the rest from the vma information.
NOTE! Some drivers might want to tweak vma->vm_page_prot first to get whatever write-combining details or similar.
void unmap_mapping_range (struct address_space * mapping, loff_t const holebegin, loff_t const holelen, int even_cows)¶
unmap the portion of all mmaps in the specified address_space corresponding to the specified page range in the underlying file.
Parameters
struct address_space * mapping
the address space containing mmaps to be unmapped.
loff_t const holebegin
byte in first page to unmap, relative to the start of the underlying file. This will be rounded down to a PAGE_SIZE boundary.
Note that this is different from truncate_pagecache(), which must keep the partial page. In contrast, we must get rid of partial pages.
loff_t const holelen
size of prospective hole in bytes. This will be rounded up to a PAGE_SIZE boundary. A holelen of zero truncates to the end of the file.
int even_cows
1 when truncating a file, unmap even private COWed pages; but 0 when invalidating pagecache, don't throw away private data.
int follow_pfn (struct vm_area_struct * vma, unsigned long address, unsigned long * pfn)¶
look up PFN at a user virtual address
Parameters
struct vm_area_struct * vma
memory mapping
unsigned long address
user virtual address
unsigned long * pfn
location to store found PFN
Description
Only IO mappings and raw PFN mappings are allowed.
Returns zero and the pfn at pfn on success, -ve otherwise.
void vm_unmap_aliases (void)¶
unmap outstanding lazy aliases in the vmap layer
Parameters
void
no arguments
Description
The vmap/vmalloc layer lazily flushes kernel virtual mappings primarily to amortize TLB flushing overheads. What this means is that any page you have now, may, in a former life, have been mapped into kernel virtual address by the vmap layer and so there might be some CPUs with TLB entries still referencing that page (additional to the regular 1:1 kernel mapping).
vm_unmap_aliases flushes all such lazy mappings.
After it returns, we can be sure that none of the pages we have control over will have any aliases from the vmap layer.
void vm_unmap_ram (const void * mem, unsigned int count)¶
unmap linear kernel address space set up by vm_map_ram
Parameters
const void * mem
the pointer returned by vm_map_ram
unsigned int count
the count passed to that vm_map_ram call (cannot unmap partial)
void * vm_map_ram (struct page ** pages, unsigned int count, int node, pgprot_t prot)¶
map pages linearly into kernel virtual address (vmalloc space)
Parameters
struct page ** pages
an array of pointers to the pages to be mapped
unsigned int count
number of pages
int node
prefer to allocate data structures on this node
pgprot_t prot
memory protection to use. PAGE_KERNEL for regular RAM
Description
If you use this function for less than VMAP_MAX_ALLOC pages, it could be faster than vmap so it's good. But if you mix long-life and short-life objects with vm_map_ram(), it could consume lots of address space through fragmentation (especially on a 32bit machine). You could see failures in the end. Please use this function for short-lived objects.
Return
a pointer to the address that has been mapped, or NULL on failure
void unmap_kernel_range_noflush (unsigned long addr, unsigned long size)¶
unmap kernel VM area
Parameters
unsigned long addr
start of the VM area to unmap
unsigned long size
size of the VM area to unmap
Description
Unmap PFN_UP(size) pages at addr. The VM area that addr and size specify should have been allocated using get_vm_area() and its friends.
NOTE
This function does NOT do any cache flushing.
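The vm_map_ram()/vm_unmap_ram() pairing described above can be sketched as follows. This is an illustrative fragment, not compiled code; touch_pages_linearly() is an invented name and the pages array is assumed to already hold valid struct page pointers.

```c
/* Sketch (not compiled): short-lived linear mapping of a page array. */
static int touch_pages_linearly(struct page **pages, unsigned int npages)
{
	void *addr;

	addr = vm_map_ram(pages, npages, NUMA_NO_NODE, PAGE_KERNEL);
	if (!addr)
		return -ENOMEM;

	/* ... access the pages through the contiguous mapping at addr ... */

	/* count must match the count passed to vm_map_ram() */
	vm_unmap_ram(addr, npages);
	return 0;
}
```

Keeping the mapping short-lived, as above, avoids the vmap-space fragmentation the documentation warns about.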
The caller is responsible for calling flush_cache_vunmap() on to-be-mapped areas before calling this function and flush_tlb_kernel_range() after.
void unmap_kernel_range (unsigned long addr, unsigned long size)¶
unmap kernel VM area and flush cache and TLB
Parameters
unsigned long addr
start of the VM area to unmap
unsigned long size
size of the VM area to unmap
Description
Similar to unmap_kernel_range_noflush() but flushes vcache before the unmapping and tlb after.
void vfree (const void * addr)¶
release memory allocated by vmalloc()
Parameters
const void * addr
memory base address
Description
Free the virtually contiguous memory area starting at addr, as obtained from vmalloc(), vmalloc_32() or __vmalloc(). If addr is NULL, no operation is performed.
Must not be called in NMI context (strictly speaking, only if we don't have CONFIG_ARCH_HAVE_NMI_SAFE_CMPXCHG, but making the calling conventions for vfree() arch-dependent would be a really bad idea)
NOTE
assumes that the object at addr has a size >= sizeof(llist_node)
void vunmap (const void * addr)¶
release virtual mapping obtained by vmap()
Parameters
const void * addr
memory base address
Description
Free the virtually contiguous memory area starting at addr, which was created from the page array passed to vmap().
Must not be called in interrupt context.
void * vmap (struct page ** pages, unsigned int count, unsigned long flags, pgprot_t prot)¶
map an array of pages into virtually contiguous space
Parameters
struct page ** pages
array of page pointers
unsigned int count
number of pages to map
unsigned long flags
vm_area->flags
pgprot_t prot
page protection for the mapping
Description
Maps count pages from pages into contiguous kernel virtual space.
void * vmalloc (unsigned long
size)¶
allocate virtually contiguous memory
Parameters
unsigned long size
allocation size
Description
Allocate enough pages to cover size from the page level allocator and map them into contiguous kernel virtual space.
For tight control over page level allocator and protection flags use __vmalloc() instead.
void * vzalloc (unsigned long size)¶
allocate virtually contiguous memory with zero fill
Parameters
unsigned long size
allocation size
Description
Allocate enough pages to cover size from the page level allocator and map them into contiguous kernel virtual space. The memory allocated is set to zero.
For tight control over page level allocator and protection flags use __vmalloc() instead.
void * vmalloc_user (unsigned long size)¶
allocate zeroed virtually contiguous memory for userspace
Parameters
unsigned long size
allocation size
Description
The resulting memory area is zeroed so it can be mapped to userspace without leaking data.
void * vmalloc_node (unsigned long size, int node)¶
allocate memory on a specific node
Parameters
unsigned long size
allocation size
int node
numa node
Description
Allocate enough pages to cover size from the page level allocator and map them into contiguous kernel virtual space.
For tight control over page level allocator and protection flags use __vmalloc() instead.
void * vzalloc_node (unsigned long size, int node)¶
allocate memory on a specific node with zero fill
Parameters
unsigned long size
allocation size
int node
numa node
Description
Allocate enough pages to cover size from the page level allocator and map them into contiguous kernel virtual space.
The memory allocated is set to zero.
For tight control over page level allocator and protection flags use __vmalloc_node() instead.
void * vmalloc_32 (unsigned long size)¶
allocate virtually contiguous memory (32bit addressable)
Parameters
unsigned long size
allocation size
Description
Allocate enough 32bit PA addressable pages to cover size from the page level allocator and map them into contiguous kernel virtual space.
void * vmalloc_32_user (unsigned long size)¶
allocate zeroed virtually contiguous 32bit memory
Parameters
unsigned long size
allocation size
Description
The resulting memory area is 32bit addressable and zeroed so it can be mapped to userspace without leaking data.
int remap_vmalloc_range_partial (struct vm_area_struct * vma, unsigned long uaddr, void * kaddr, unsigned long size)¶
map vmalloc pages to userspace
Parameters
struct vm_area_struct * vma
vma to cover
unsigned long uaddr
target user address to start at
void * kaddr
virtual address of vmalloc kernel memory
unsigned long size
size of map area
Return
0 for success, -Exxx on failure
This function checks that kaddr is a valid vmalloc'ed area, and that it is big enough to cover the range starting at uaddr in vma.
Will return failure if that criterion isn't met.
Similar to remap_pfn_range() (see mm/memory.c)
int remap_vmalloc_range (struct vm_area_struct * vma, void * addr, unsigned long pgoff)¶
map vmalloc pages to userspace
Parameters
struct vm_area_struct * vma
vma to cover (map full range of vma)
void * addr
vmalloc memory
unsigned long pgoff
number of pages into addr before first page to map
Return
0 for success, -Exxx on failure
This function checks that addr is a valid vmalloc'ed area, and that it is big enough to cover the vma. Will return failure if that criterion isn't met.
Similar to remap_pfn_range() (see mm/memory.c)
struct vm_struct * alloc_vm_area (size_t size, pte_t ** ptes)¶
allocate a range of kernel address space
Parameters
size_t size
size of the area
pte_t ** ptes
returns the PTEs for the address space
Return
NULL on failure, vm_struct on success
This function reserves a range of kernel address space, and allocates pagetables to map that range.
No actual mappings are created.
If ptes is non-NULL, pointers to the PTEs (in init_mm) allocated for the VM area are returned.
unsigned long __get_pfnblock_flags_mask (struct page * page, unsigned long pfn, unsigned long end_bitidx, unsigned long mask)¶
Return the requested group of flags for the pageblock_nr_pages block of pages
Parameters
struct page * page
The page within the block of interest
unsigned long pfn
The target page frame number
unsigned long end_bitidx
The last bit of interest to retrieve
unsigned long mask
mask of bits that the caller is interested in
Return
pageblock_bits flags
void set_pfnblock_flags_mask (struct page * page, unsigned long flags, unsigned long pfn, unsigned long end_bitidx, unsigned long mask)¶
Set the requested group of flags for a pageblock_nr_pages block of pages
Parameters
struct page * page
The page within the block of interest
unsigned long flags
The flags to set
unsigned long pfn
The target page frame number
unsigned long end_bitidx
The last bit of interest
unsigned long mask
mask of bits that the caller is interested in
void * alloc_pages_exact_nid (int nid, size_t size, gfp_t gfp_mask)¶
allocate an exact number of physically-contiguous pages on a node.
Parameters
int nid
the preferred node ID where memory should be allocated
size_t size
the number of bytes to allocate
gfp_t gfp_mask
GFP flags for the allocation
Description
Like alloc_pages_exact(), but try to allocate on node nid first before falling back.
unsigned long nr_free_zone_pages (int offset)¶
count number of pages beyond high watermark
Parameters
int offset
The zone index of the highest zone
Description
nr_free_zone_pages() counts the number of pages which are beyond the high watermark within all zones at or below a given zone
index. For each zone, the number of pages is calculated as:
nr_free_zone_pages = managed_pages - high_pages
unsigned long nr_free_pagecache_pages (void)¶
count number of pages beyond high watermark
Parameters
void
no arguments
Description
nr_free_pagecache_pages() counts the number of pages which are beyond the high watermark within all zones.
int find_next_best_node (int node, nodemask_t * used_node_mask)¶
find the next node that should appear in a given node’s fallback list
Parameters
int node
node whose fallback list we’re appending
nodemask_t * used_node_mask
nodemask_t of already used nodes
Description
We use a number of factors to determine which is the next node that should appear on a given node’s fallback list. The node should not have appeared already in node’s fallback list, and it should be the next closest node according to the distance array (which contains arbitrary distance values from each node to each node in the system), and should also prefer nodes with no CPUs, since presumably they’ll have very little allocation pressure on them otherwise. It returns -1 if no node is found.
void free_bootmem_with_active_regions (int nid, unsigned long max_low_pfn)¶
Call memblock_free_early_nid for each active range
Parameters
int nid
The node to free memory on. If MAX_NUMNODES, all nodes are freed.
unsigned long max_low_pfn
The highest PFN that will be passed to memblock_free_early_nid
Description
If an architecture guarantees that all ranges registered contain no holes and may be freed, this function may be used instead of calling memblock_free_early_nid() manually.
void sparse_memory_present_with_active_regions (int nid)¶
Call memory_present for each active range
Parameters
int nid
The node to call memory_present for. 
If MAX_NUMNODES, all nodes will be used.\r\nDescription\r\nIf an architecture guarantees that all ranges registered contain no holes and may be freed, this function may be\r\nused instead of calling memory_present() manually.\r\nvoid get_pfn_range_for_nid (unsigned int nid, unsigned long * start_pfn, unsigned long * end_pfn)¶\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 92 of 228\n\nReturn the start and end page frames for a node\r\nParameters\r\nunsigned int nid\r\nThe nid to return the range for. If MAX_NUMNODES, the min and max PFN are returned.\r\nunsigned long * start_pfn\r\nPassed by reference. On return, it will have the node start_pfn.\r\nunsigned long * end_pfn\r\nPassed by reference. On return, it will have the node end_pfn.\r\nDescription\r\nIt returns the start and end page frame of a node based on information provided by memblock_set_node() . If\r\ncalled for a node with no available memory, a warning is printed and the start and end PFNs will be 0.\r\nunsigned long absent_pages_in_range (unsigned long start_pfn, unsigned long end_pfn)¶\r\nReturn number of page frames in holes within a range\r\nParameters\r\nunsigned long start_pfn\r\nThe start PFN to start searching for holes\r\nunsigned long end_pfn\r\nThe end PFN to stop searching for holes\r\nDescription\r\nIt returns the number of pages frames in memory holes within a range.\r\nunsigned long node_map_pfn_alignment (void)¶\r\ndetermine the maximum internode alignment\r\nParameters\r\nvoid\r\nno arguments\r\nDescription\r\nThis function should be called after node map is populated and sorted. It calculates the maximum power of two\r\nalignment which can distinguish all the nodes.\r\nFor example, if all nodes are 1GiB and aligned to 1GiB, the return value would indicate 1GiB alignment with (1\r\n\u003c\u003c (30 - PAGE_SHIFT)). If the nodes are shifted by 256MiB, 256MiB. 
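The alignment computation just illustrated can be sketched in ordinary userspace C. This is a simplified analogue, not the kernel implementation: struct node_range and pfn_alignment() are invented names, __builtin_ctzl (a GCC/Clang builtin) stands in for the kernel's __ffs(), and the memory map is passed as a plain sorted array of per-node ranges.

```c
#include <assert.h>
#include <stddef.h>

struct node_range { unsigned long start, end; int nid; };

/* For each boundary between two different nodes, start from a mask fine
 * enough to pin-point the boundary's start pfn and widen it until it can
 * no longer separate the current node from the previous one; accumulate
 * all such masks and convert the result back to a page count. */
static unsigned long pfn_alignment(const struct node_range *r, size_t n)
{
    unsigned long accl_mask = 0, last_end = 0;
    int last_nid = -1;

    for (size_t i = 0; i < n; i++) {
        unsigned long start = r[i].start, mask;

        if (!start || last_nid < 0 || last_nid == r[i].nid) {
            last_nid = r[i].nid;
            last_end = r[i].end;
            continue;
        }

        /* finest mask that isolates the boundary's start pfn */
        mask = ~((1UL << __builtin_ctzl(start)) - 1);
        while (mask && last_end <= (start & (mask << 1)))
            mask <<= 1;             /* coarsen while nodes stay distinct */

        accl_mask |= mask;
        last_nid = r[i].nid;
        last_end = r[i].end;
    }

    return ~accl_mask + 1;          /* mask -> number of pages */
}
```

With 4 KiB pages, two 1 GiB-aligned 1 GiB nodes yield 0x40000 pages (1 GiB); shifting the second node by 256 MiB reduces the result to 0x10000 pages (256 MiB), matching the example above.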
Note that if only the last node is shifted,\r\n1GiB is enough and this function will indicate so.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 93 of 228\n\nThis is used to test whether pfn -\u003e nid mapping of the chosen memory model has fine enough granularity to avoid\r\nincorrect mapping for the populated node map.\r\nReturns the determined alignment in pfn’s. 0 if there is no alignment requirement (single node).\r\nunsigned long find_min_pfn_with_active_regions (void)¶\r\nFind the minimum PFN registered\r\nParameters\r\nvoid\r\nno arguments\r\nDescription\r\nIt returns the minimum PFN based on information provided via memblock_set_node() .\r\nvoid free_area_init_nodes (unsigned long * max_zone_pfn)¶\r\nInitialise all pg_data_t and zone data\r\nParameters\r\nunsigned long * max_zone_pfn\r\nan array of max PFNs for each zone\r\nDescription\r\nThis will call free_area_init_node() for each active node in the system. Using the page ranges provided by\r\nmemblock_set_node() , the size of each zone in each node and their holes is calculated. If the maximum PFN\r\nbetween two adjacent zones match, it is assumed that the zone is empty. For example, if arch_max_dma_pfn ==\r\narch_max_dma32_pfn, it is assumed that arch_max_dma32_pfn has no pages. It is also assumed that a zone starts\r\nwhere the previous one ended. For example, ZONE_DMA32 starts at arch_max_dma_pfn.\r\nvoid set_dma_reserve (unsigned long new_dma_reserve)¶\r\nset the specified number of pages reserved in the first zone\r\nParameters\r\nunsigned long new_dma_reserve\r\nThe number of pages to mark reserved\r\nDescription\r\nThe per-cpu batchsize and zone watermarks are determined by managed_pages. In the DMA zone, a significant\r\npercentage may be consumed by kernel image and other unfreeable allocations which can skew the watermarks\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 94 of 228\n\nbadly. 
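The zone-sizing rule described for free_area_init_nodes(), namely that each zone starts where the previous one ended and that matching adjacent max PFNs mean an empty zone, can be sketched as follows (illustrative names only; NR_ZONES is fixed at 3 for the example):

```c
#include <assert.h>

#define NR_ZONES 3   /* e.g. DMA, DMA32, NORMAL */

/* Derive [start, end) page-frame spans for each zone from the
 * max_zone_pfn array: a zone whose max PFN equals the previous
 * zone's max PFN ends up with start == end, i.e. it is empty. */
static void zone_spans(const unsigned long max_zone_pfn[NR_ZONES],
                       unsigned long start[NR_ZONES],
                       unsigned long end[NR_ZONES])
{
    unsigned long prev = 0;
    for (int i = 0; i < NR_ZONES; i++) {
        start[i] = prev;
        end[i]   = max_zone_pfn[i];
        prev     = max_zone_pfn[i];
    }
}
```

For instance, max_zone_pfn = {0x1000, 0x1000, 0x40000} gives an empty middle zone, analogous to arch_max_dma_pfn == arch_max_dma32_pfn above.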
This function may optionally be used to account for unfreeable pages in the first zone (e.g., ZONE_DMA). The effect will be lower watermarks and smaller per-cpu batchsize.
void setup_per_zone_wmarks (void)¶
called when min_free_kbytes changes or when memory is hot-{added|removed}
Parameters
void
no arguments
Description
Ensures that the watermark[min,low,high] values for each zone are set correctly with respect to min_free_kbytes.
int alloc_contig_range (unsigned long start, unsigned long end, unsigned migratetype, gfp_t gfp_mask)¶
tries to allocate given range of pages
Parameters
unsigned long start
start PFN to allocate
unsigned long end
one-past-the-last PFN to allocate
unsigned migratetype
migratetype of the underlying pageblocks (either #MIGRATE_MOVABLE or #MIGRATE_CMA). All pageblocks in range must have the same migratetype and it must be either of the two.
gfp_t gfp_mask
GFP mask to use during compaction
Description
The PFN range does not have to be pageblock or MAX_ORDER_NR_PAGES aligned, however it’s the caller’s responsibility to guarantee that we are the only thread that changes migrate type of pageblocks the pages fall in. The PFN range must belong to a single zone.
Returns zero on success or a negative error code. On success all pages whose PFN is in [start, end) are allocated for the caller and need to be freed with free_contig_range().
void mempool_destroy (mempool_t * pool)¶
deallocate a memory pool
Parameters
mempool_t * pool
pointer to the memory pool which was allocated via mempool_create().
Description
Free all reserved elements in pool and pool itself. 
This function only sleeps if the free_fn() function sleeps.\r\nmempool_t * mempool_create (int min_nr, mempool_alloc_t * alloc_fn, mempool_free_t * free_fn, void\r\n* pool_data)¶\r\ncreate a memory pool\r\nParameters\r\nint min_nr\r\nthe minimum number of elements guaranteed to be allocated for this pool.\r\nmempool_alloc_t * alloc_fn\r\nuser-defined element-allocation function.\r\nmempool_free_t * free_fn\r\nuser-defined element-freeing function.\r\nvoid * pool_data\r\noptional private data available to the user-defined functions.\r\nDescription\r\nthis function creates and allocates a guaranteed size, preallocated memory pool. The pool can be used from the\r\nmempool_alloc() and mempool_free() functions. This function might sleep. Both the alloc_fn() and the\r\nfree_fn() functions might sleep - as long as the mempool_alloc() function is not called from IRQ contexts.\r\nint mempool_resize (mempool_t * pool, int new_min_nr)¶\r\nresize an existing memory pool\r\nParameters\r\nmempool_t * pool\r\npointer to the memory pool which was allocated via mempool_create() .\r\nint new_min_nr\r\nthe new minimum number of elements guaranteed to be allocated for this pool.\r\nDescription\r\nThis function shrinks/grows the pool. In the case of growing, it cannot be guaranteed that the pool will be grown\r\nto the new size immediately, but new mempool_free() calls will refill it. This function may sleep.\r\nNote, the caller must guarantee that no mempool_destroy is called while this function is running.\r\nmempool_alloc() \u0026 mempool_free() might be called (eg. 
from IRQ contexts) while this function executes.\r\nvoid * mempool_alloc (mempool_t * pool, gfp_t gfp_mask)¶\r\nallocate an element from a specific memory pool\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 96 of 228\n\nParameters\r\nmempool_t * pool\r\npointer to the memory pool which was allocated via mempool_create() .\r\ngfp_t gfp_mask\r\nthe usual allocation bitmask.\r\nDescription\r\nthis function only sleeps if the alloc_fn() function sleeps or returns NULL. Note that due to preallocation, this\r\nfunction never fails when called from process contexts. (it might fail if called from an IRQ context.)\r\nNote\r\nusing __GFP_ZERO is not supported.\r\nvoid mempool_free (void * element, mempool_t * pool)¶\r\nreturn an element to the pool.\r\nParameters\r\nvoid * element\r\npool element pointer.\r\nmempool_t * pool\r\npointer to the memory pool which was allocated via mempool_create() .\r\nDescription\r\nthis function only sleeps if the free_fn() function sleeps.\r\nstruct dma_pool * dma_pool_create (const char * name, struct device * dev, size_t size, size_t align,\r\nsize_t boundary)¶\r\nCreates a pool of consistent memory blocks, for dma.\r\nParameters\r\nconst char * name\r\nname of pool, for diagnostics\r\nstruct device * dev\r\ndevice that will be doing the DMA\r\nsize_t size\r\nsize of the blocks in this pool.\r\nsize_t align\r\nalignment requirement for blocks; must be a power of two\r\nsize_t boundary\r\nreturned blocks won’t cross this power of two boundary\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 97 of 228\n\nContext\r\n!:c:func:in_interrupt()\r\nDescription\r\nReturns a dma allocation pool with the requested characteristics, or null if one can’t be created. Given one of these\r\npools, dma_pool_alloc() may be used to allocate memory. Such memory will all have “consistent” DMA\r\nmappings, accessible by the device and its driver without using cache flushing primitives. 
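The guarantee documented for mempool_create() and mempool_alloc(), that allocation cannot fail from process context while preallocated elements remain, can be illustrated with a small userspace analogue. All names here are invented (tiny_pool, pool_create, etc.); the real mempool_t additionally handles locking, waiting, and the IRQ-context rules described above.

```c
#include <stdlib.h>

/* A pool that preallocates min_nr elements at create time; the normal
 * allocator is tried first and the reserve is the fallback, so an
 * allocation succeeds as long as reserved elements remain. */
struct tiny_pool {
    size_t elem_size;
    int min_nr;           /* guaranteed reserve size */
    int curr_nr;          /* reserved elements currently in the pool */
    void **elements;
};

static struct tiny_pool *pool_create(int min_nr, size_t elem_size)
{
    struct tiny_pool *p = malloc(sizeof(*p));
    if (!p) return NULL;
    p->elem_size = elem_size;
    p->min_nr = min_nr;
    p->curr_nr = 0;
    p->elements = malloc(min_nr * sizeof(void *));
    for (int i = 0; p->elements && i < min_nr; i++) {
        void *e = malloc(elem_size);
        if (!e) break;
        p->elements[p->curr_nr++] = e;  /* fill the reserve up front */
    }
    return p;
}

static void *pool_alloc(struct tiny_pool *p)
{
    void *e = malloc(p->elem_size);     /* normal allocation first */
    if (!e && p->curr_nr > 0)
        e = p->elements[--p->curr_nr];  /* fall back to the reserve */
    return e;
}

static void pool_free(struct tiny_pool *p, void *e)
{
    if (p->curr_nr < p->min_nr)
        p->elements[p->curr_nr++] = e;  /* refill the reserve first */
    else
        free(e);
}
```

As in the kernel API, frees refill the reserve before returning memory to the general allocator, which is why mempool_resize() can rely on later mempool_free() calls to grow the pool.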
The actual size of\r\nblocks allocated may be larger than requested because of alignment.\r\nIf boundary is nonzero, objects returned from dma_pool_alloc() won’t cross that size boundary. This is useful\r\nfor devices which have addressing restrictions on individual DMA transfers, such as not crossing boundaries of\r\n4KBytes.\r\nvoid dma_pool_destroy (struct dma_pool * pool)¶\r\ndestroys a pool of dma memory blocks.\r\nParameters\r\nstruct dma_pool * pool\r\ndma pool that will be destroyed\r\nContext\r\n!:c:func:in_interrupt()\r\nDescription\r\nCaller guarantees that no more memory from the pool is in use, and that nothing will try to use the pool after this\r\ncall.\r\nvoid * dma_pool_alloc (struct dma_pool * pool, gfp_t mem_flags, dma_addr_t * handle)¶\r\nget a block of consistent memory\r\nParameters\r\nstruct dma_pool * pool\r\ndma pool that will produce the block\r\ngfp_t mem_flags\r\nGFP_* bitmask\r\ndma_addr_t * handle\r\npointer to dma address of block\r\nDescription\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 98 of 228\n\nThis returns the kernel virtual address of a currently unused block, and reports its dma address through the handle.\r\nIf such a memory block can’t be allocated, NULL is returned.\r\nvoid dma_pool_free (struct dma_pool * pool, void * vaddr, dma_addr_t dma)¶\r\nput block back into dma pool\r\nParameters\r\nstruct dma_pool * pool\r\nthe dma pool holding the block\r\nvoid * vaddr\r\nvirtual address of block\r\ndma_addr_t dma\r\ndma address of block\r\nDescription\r\nCaller promises neither device nor driver will again touch this block unless it is first re-allocated.\r\nstruct dma_pool * dmam_pool_create (const char * name, struct device * dev, size_t size, size_t align,\r\nsize_t allocation)¶\r\nManaged dma_pool_create()\r\nParameters\r\nconst char * name\r\nname of pool, for diagnostics\r\nstruct device * dev\r\ndevice that will be doing the DMA\r\nsize_t size\r\nsize of the blocks in this pool.\r\nsize_t 
align\r\nalignment requirement for blocks; must be a power of two\r\nsize_t allocation\r\nreturned blocks won’t cross this boundary (or zero)\r\nDescription\r\nManaged dma_pool_create() . DMA pool created with this function is automatically destroyed on driver detach.\r\nvoid dmam_pool_destroy (struct dma_pool * pool)¶\r\nManaged dma_pool_destroy()\r\nParameters\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 99 of 228\n\nstruct dma_pool * pool\r\ndma pool that will be destroyed\r\nDescription\r\nManaged dma_pool_destroy() .\r\nvoid balance_dirty_pages_ratelimited (struct address_space * mapping)¶\r\nbalance dirty memory state\r\nParameters\r\nstruct address_space * mapping\r\naddress_space which was dirtied\r\nDescription\r\nProcesses which are dirtying memory should call in here once for each page which was newly dirtied. The\r\nfunction will periodically check the system’s dirty state and will initiate writeback if needed.\r\nOn really big machines, get_writeback_state is expensive, so try to avoid calling it too often (ratelimiting). But\r\nonce we’re over the dirty memory limit we decrease the ratelimiting by a lot, to prevent individual processes from\r\novershooting the limit by (ratelimit_pages) each.\r\nvoid tag_pages_for_writeback (struct address_space * mapping, pgoff_t start, pgoff_t end)¶\r\ntag pages to be written by write_cache_pages\r\nParameters\r\nstruct address_space * mapping\r\naddress space structure to write\r\npgoff_t start\r\nstarting page index\r\npgoff_t end\r\nending page index (inclusive)\r\nDescription\r\nThis function scans the page range from start to end (inclusive) and tags all pages that have DIRTY tag set with a\r\nspecial TOWRITE tag. The idea is that write_cache_pages (or whoever calls this function) will then use\r\nTOWRITE tag to identify pages eligible for writeback. 
This mechanism is used to avoid livelocking of writeback\r\nby a process steadily creating new dirty pages in the file (thus it is important for this function to be quick so that it\r\ncan tag pages faster than a dirtying process can create them).\r\nint write_cache_pages (struct address_space * mapping, struct writeback_control * wbc, writepage_t writepage,\r\nvoid * data)¶\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 100 of 228\n\nwalk the list of dirty pages of the given address space and write all of them.\r\nParameters\r\nstruct address_space * mapping\r\naddress space structure to write\r\nstruct writeback_control * wbc\r\nsubtract the number of written pages from *wbc-\u003enr_to_write\r\nwritepage_t writepage\r\nfunction called for each page\r\nvoid * data\r\ndata passed to writepage function\r\nDescription\r\nIf a page is already under I/O, write_cache_pages() skips it, even if it’s dirty. This is desirable behaviour for\r\nmemory-cleaning writeback, but it is INCORRECT for data-integrity system calls such as fsync() . fsync()\r\nand msync() need to guarantee that all the data which was dirty at the time the call was made get new I/O started\r\nagainst them. If wbc-\u003esync_mode is WB_SYNC_ALL then we were called for data integrity and we must wait for\r\nexisting IO to complete.\r\nTo avoid livelocks (when other process dirties new pages), we first tag pages which should be written back with\r\nTOWRITE tag and only then start writing them. For data-integrity sync we have to be careful so that we do not\r\nmiss some pages (e.g., because some other process has cleared TOWRITE tag we set). 
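The tag-then-write scheme used by tag_pages_for_writeback() and write_cache_pages() can be sketched in userspace: first tag everything that is dirty now, then write back only tagged pages, so pages dirtied during the walk simply wait for the next cycle. The model below is deliberately crude (page states are plain ints, names are invented):

```c
#define NPAGES 8
enum { CLEAN = 0, DIRTY = 1, TOWRITE = 2 };   /* per-page tag bits */

/* First pass: tag every currently-dirty page with TOWRITE. */
static void tag_for_writeback(int tags[NPAGES])
{
    for (int i = 0; i < NPAGES; i++)
        if (tags[i] & DIRTY)
            tags[i] |= TOWRITE;
}

/* Second pass: write back only TOWRITE pages.  A concurrent dirtier is
 * simulated by re-dirtying a neighbour on every write; because those
 * new dirty pages carry no TOWRITE tag, the walk still terminates. */
static int write_tagged(int tags[NPAGES])
{
    int written = 0;
    for (int i = 0; i < NPAGES; i++) {
        if (!(tags[i] & TOWRITE))
            continue;
        tags[i] = CLEAN;                  /* clear DIRTY and TOWRITE together */
        written++;
        tags[(i + 1) % NPAGES] |= DIRTY;  /* steady dirtier at work */
    }
    return written;
}
```

Without the tagging pass, the writer in this model would chase the dirtier around the array indefinitely; with it, exactly one batch is written per cycle.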
The rule we follow is that\r\nTOWRITE tag can be cleared only by the process clearing the DIRTY tag (and submitting the page for IO).\r\nint generic_writepages (struct address_space * mapping, struct writeback_control * wbc)¶\r\nwalk the list of dirty pages of the given address space and writepage() all of them.\r\nParameters\r\nstruct address_space * mapping\r\naddress space structure to write\r\nstruct writeback_control * wbc\r\nsubtract the number of written pages from *wbc-\u003enr_to_write\r\nDescription\r\nThis is a library function, which implements the writepages() address_space_operation.\r\nint write_one_page (struct page * page, int wait)¶\r\nwrite out a single page and optionally wait on I/O\r\nParameters\r\nstruct page * page\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 101 of 228\n\nthe page to write\r\nint wait\r\nif true, wait on writeout\r\nDescription\r\nThe page must be locked by the caller and will be unlocked upon return.\r\nwrite_one_page() returns a negative error code if I/O failed.\r\nvoid wait_for_stable_page (struct page * page)¶\r\nwait for writeback to finish, if necessary.\r\nParameters\r\nstruct page * page\r\nThe page to wait on.\r\nDescription\r\nThis function determines if the given page is related to a backing device that requires page contents to be held\r\nstable during writeback. 
If so, then it will wait for any pending writeback to complete.\r\nvoid truncate_inode_pages_range (struct address_space * mapping, loff_t lstart, loff_t lend)¶\r\ntruncate range of pages specified by start \u0026 end byte offsets\r\nParameters\r\nstruct address_space * mapping\r\nmapping to truncate\r\nloff_t lstart\r\noffset from which to truncate\r\nloff_t lend\r\noffset to which to truncate (inclusive)\r\nDescription\r\nTruncate the page cache, removing the pages that are between specified offsets (and zeroing out partial pages if\r\nlstart or lend + 1 is not page aligned).\r\nTruncate takes two passes - the first pass is nonblocking. It will not block on page locks and it will not block on\r\nwriteback. The second pass will wait. This is to prevent as much IO as possible in the affected region. The first\r\npass will remove most pages, so the search cost of the second pass is low.\r\nWe pass down the cache-hot hint to the page freeing code. Even if the mapping is large, it is probably the case that\r\nthe final pages are the most recently touched, and freeing happens in ascending file offset order.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 102 of 228\n\nNote that since -\u003e:c:func:invalidatepage() accepts range to invalidate truncate_inode_pages_range is able to\r\nhandle cases where lend + 1 is not page aligned properly.\r\nvoid truncate_inode_pages (struct address_space * mapping, loff_t lstart)¶\r\ntruncate all the pages from an offset\r\nParameters\r\nstruct address_space * mapping\r\nmapping to truncate\r\nloff_t lstart\r\noffset from which to truncate\r\nDescription\r\nCalled under (and serialised by) inode-\u003ei_mutex.\r\nNote\r\nWhen this function returns, there can be a page in the process of deletion (inside __delete_from_page_cache() )\r\nin the specified range. 
Thus mapping-\u003enrpages can be non-zero when this function returns even after truncation of\r\nthe whole mapping.\r\nvoid truncate_inode_pages_final (struct address_space * mapping)¶\r\ntruncate all pages before inode dies\r\nParameters\r\nstruct address_space * mapping\r\nmapping to truncate\r\nDescription\r\nCalled under (and serialized by) inode-\u003ei_mutex.\r\nFilesystems have to use this in the .evict_inode path to inform the VM that this is the final truncate and the inode\r\nis going away.\r\nunsigned long invalidate_mapping_pages (struct address_space * mapping, pgoff_t start, pgoff_t end)¶\r\nInvalidate all the unlocked pages of one inode\r\nParameters\r\nstruct address_space * mapping\r\nthe address_space which holds the pages to invalidate\r\npgoff_t start\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 103 of 228\n\nthe offset ‘from’ which to invalidate\r\npgoff_t end\r\nthe offset ‘to’ which to invalidate (inclusive)\r\nDescription\r\nThis function only removes the unlocked pages, if you want to remove all the pages of one inode, you must call\r\ntruncate_inode_pages.\r\ninvalidate_mapping_pages() will not block on IO activity. 
It will not invalidate pages which are dirty, locked,\r\nunder writeback or mapped into pagetables.\r\nint invalidate_inode_pages2_range (struct address_space * mapping, pgoff_t start, pgoff_t end)¶\r\nremove range of pages from an address_space\r\nParameters\r\nstruct address_space * mapping\r\nthe address_space\r\npgoff_t start\r\nthe page offset ‘from’ which to invalidate\r\npgoff_t end\r\nthe page offset ‘to’ which to invalidate (inclusive)\r\nDescription\r\nAny pages which are found to be mapped into pagetables are unmapped prior to invalidation.\r\nReturns -EBUSY if any pages could not be invalidated.\r\nint invalidate_inode_pages2 (struct address_space * mapping)¶\r\nremove all pages from an address_space\r\nParameters\r\nstruct address_space * mapping\r\nthe address_space\r\nDescription\r\nAny pages which are found to be mapped into pagetables are unmapped prior to invalidation.\r\nReturns -EBUSY if any pages could not be invalidated.\r\nvoid truncate_pagecache (struct inode * inode, loff_t newsize)¶\r\nunmap and remove pagecache that has been truncated\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 104 of 228\n\nParameters\r\nstruct inode * inode\r\ninode\r\nloff_t newsize\r\nnew file size\r\nDescription\r\ninode’s new i_size must already be written before truncate_pagecache is called.\r\nThis function should typically be called before the filesystem releases resources associated with the freed range\r\n(eg. deallocates blocks). 
This way, pagecache will always stay logically coherent with on-disk format, and the\r\nfilesystem would not have to deal with situations such as writepage being called for a page that has already had its\r\nunderlying blocks deallocated.\r\nvoid truncate_setsize (struct inode * inode, loff_t newsize)¶\r\nupdate inode and pagecache for a new file size\r\nParameters\r\nstruct inode * inode\r\ninode\r\nloff_t newsize\r\nnew file size\r\nDescription\r\ntruncate_setsize updates i_size and performs pagecache truncation (if necessary) to newsize. It will be typically be\r\ncalled from the filesystem’s setattr function when ATTR_SIZE is passed in.\r\nMust be called with a lock serializing truncates and writes (generally i_mutex but e.g. xfs uses a different lock)\r\nand before all filesystem specific block truncation has been performed.\r\nvoid pagecache_isize_extended (struct inode * inode, loff_t from, loff_t to)¶\r\nupdate pagecache after extension of i_size\r\nParameters\r\nstruct inode * inode\r\ninode for which i_size was extended\r\nloff_t from\r\noriginal inode size\r\nloff_t to\r\nnew inode size\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 105 of 228\n\nDescription\r\nHandle extension of inode size either caused by extending truncate or by write starting after current i_size. We\r\nmark the page straddling current i_size RO so that page_mkwrite() is called on the nearest write access to the\r\npage. This way filesystem can be sure that page_mkwrite() is called on the page before user writes to the page\r\nvia mmap after the i_size has been changed.\r\nThe function must be called after i_size is updated so that page fault coming after we unlock the page will already\r\nsee the new i_size. 
The function must be called while we still hold i_mutex - this not only makes sure i_size is\r\nstable but also that userspace cannot observe new i_size value before we are prepared to store mmap writes at new\r\ninode size.\r\nvoid truncate_pagecache_range (struct inode * inode, loff_t lstart, loff_t lend)¶\r\nunmap and remove pagecache that is hole-punched\r\nParameters\r\nstruct inode * inode\r\ninode\r\nloff_t lstart\r\noffset of beginning of hole\r\nloff_t lend\r\noffset of last byte of hole\r\nDescription\r\nThis function should typically be called before the filesystem releases resources associated with the freed range\r\n(eg. deallocates blocks). This way, pagecache will always stay logically coherent with on-disk format, and the\r\nfilesystem would not have to deal with situations such as writepage being called for a page that has already had its\r\nunderlying blocks deallocated.\r\nKernel IPC facilities¶\r\nIPC utilities¶\r\nint ipc_init (void)¶\r\ninitialise ipc subsystem\r\nParameters\r\nvoid\r\nno arguments\r\nDescription\r\nThe various sysv ipc resources (semaphores, messages and shared memory) are initialised.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 106 of 228\n\nA callback routine is registered into the memory hotplug notifier chain: since msgmni scales to lowmem this\r\ncallback routine will be called upon successful memory add / remove to recompute msmgni.\r\nvoid ipc_init_ids (struct ipc_ids * ids)¶\r\ninitialise ipc identifiers\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nDescription\r\nSet up the sequence range to use for the ipc identifier range (limited below IPCMNI) then initialise the ids idr.\r\nvoid ipc_init_proc_interface (const char * path, const char * header, int ids, int (*show) (struct seq_file *,\r\nvoid *)¶\r\ncreate a proc interface for sysipc types using a seq_file interface.\r\nParameters\r\nconst char * path\r\nPath in procfs\r\nconst char * header\r\nBanner to be printed at 
the beginning of the file.\r\nint ids\r\nipc id table to iterate.\r\nint (*)(struct seq_file *, void *) show\r\nshow routine.\r\nstruct kern_ipc_perm * ipc_findkey (struct ipc_ids * ids, key_t key)¶\r\nfind a key in an ipc identifier set\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nkey_t key\r\nkey to find\r\nDescription\r\nReturns the locked pointer to the ipc structure if found or NULL otherwise. If key is found ipc points to the\r\nowning ipc structure\r\nCalled with ipc_ids.rwsem held.\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 107 of 228\n\nint ipc_get_maxid (struct ipc_ids * ids)¶\r\nget the last assigned id\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nDescription\r\nCalled with ipc_ids.rwsem held.\r\nint ipc_addid (struct ipc_ids * ids, struct kern_ipc_perm * new, int size)¶\r\nadd an ipc identifier\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nstruct kern_ipc_perm * new\r\nnew ipc permission set\r\nint size\r\nlimit for the number of used ids\r\nDescription\r\nAdd an entry ‘new’ to the ipc ids idr. The permissions object is initialised and the first free entry is set up and the\r\nid assigned is returned. The ‘new’ entry is returned in a locked state on success. 
On failure the entry is not locked\r\nand a negative err-code is returned.\r\nCalled with writer ipc_ids.rwsem held.\r\nint ipcget_new (struct ipc_namespace * ns, struct ipc_ids * ids, const struct ipc_ops * ops, struct ipc_params\r\n* params)¶\r\ncreate a new ipc object\r\nParameters\r\nstruct ipc_namespace * ns\r\nipc namespace\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nconst struct ipc_ops * ops\r\nthe actual creation routine to call\r\nstruct ipc_params * params\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 108 of 228\n\nits parameters\r\nDescription\r\nThis routine is called by sys_msgget, sys_semget() and sys_shmget() when the key is IPC_PRIVATE.\r\nint ipc_check_perms (struct ipc_namespace * ns, struct kern_ipc_perm * ipcp, const struct ipc_ops * ops, struct\r\nipc_params * params)¶\r\ncheck security and permissions for an ipc object\r\nParameters\r\nstruct ipc_namespace * ns\r\nipc namespace\r\nstruct kern_ipc_perm * ipcp\r\nipc permission set\r\nconst struct ipc_ops * ops\r\nthe actual security routine to call\r\nstruct ipc_params * params\r\nits parameters\r\nDescription\r\nThis routine is called by sys_msgget() , sys_semget() and sys_shmget() when the key is not IPC_PRIVATE\r\nand that key already exists in the ds IDR.\r\nOn success, the ipc id is returned.\r\nIt is called with ipc_ids.rwsem and ipcp-\u003elock held.\r\nint ipcget_public (struct ipc_namespace * ns, struct ipc_ids * ids, const struct ipc_ops * ops, struct ipc_params\r\n* params)¶\r\nget an ipc object or create a new one\r\nParameters\r\nstruct ipc_namespace * ns\r\nipc namespace\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nconst struct ipc_ops * ops\r\nthe actual creation routine to call\r\nstruct ipc_params * params\r\nits parameters\r\nDescription\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 109 of 228\n\nThis routine is called by sys_msgget, sys_semget() and sys_shmget() when the key is not IPC_PRIVATE. 
It\r\nadds a new entry if the key is not found and does some permission / security checkings if the key is found.\r\nOn success, the ipc id is returned.\r\nvoid ipc_rmid (struct ipc_ids * ids, struct kern_ipc_perm * ipcp)¶\r\nremove an ipc identifier\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nstruct kern_ipc_perm * ipcp\r\nipc perm structure containing the identifier to remove\r\nDescription\r\nipc_ids.rwsem (as a writer) and the spinlock for this ID are held before this function is called, and remain locked\r\non the exit.\r\nvoid * ipc_alloc (int size)¶\r\nallocate ipc space\r\nParameters\r\nint size\r\nsize desired\r\nDescription\r\nAllocate memory from the appropriate pools and return a pointer to it. NULL is returned if the allocation fails\r\nvoid ipc_free (void * ptr)¶\r\nfree ipc space\r\nParameters\r\nvoid * ptr\r\npointer returned by ipc_alloc\r\nDescription\r\nFree a block created with ipc_alloc() .\r\nvoid * ipc_rcu_alloc (int size)¶\r\nallocate ipc and rcu space\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 110 of 228\n\nParameters\r\nint size\r\nsize desired\r\nDescription\r\nAllocate memory for the rcu header structure + the object. Returns the pointer to the object or NULL upon failure.\r\nint ipcperms (struct ipc_namespace * ns, struct kern_ipc_perm * ipcp, short flag)¶\r\ncheck ipc permissions\r\nParameters\r\nstruct ipc_namespace * ns\r\nipc namespace\r\nstruct kern_ipc_perm * ipcp\r\nipc permission set\r\nshort flag\r\ndesired permission set\r\nDescription\r\nCheck user, group, other permissions for access to ipc resources. 
return 0 if allowed\r\nflag will most probably be 0 or S_...UGO from \u003clinux/stat.h\u003e\r\nvoid kernel_to_ipc64_perm (struct kern_ipc_perm * in, struct ipc64_perm * out)¶\r\nconvert kernel ipc permissions to user\r\nParameters\r\nstruct kern_ipc_perm * in\r\nkernel permissions\r\nstruct ipc64_perm * out\r\nnew style ipc permissions\r\nDescription\r\nTurn the kernel object in into a set of permissions descriptions for returning to userspace (out).\r\nvoid ipc64_perm_to_ipc_perm (struct ipc64_perm * in, struct ipc_perm * out)¶\r\nconvert new ipc permissions to old\r\nParameters\r\nstruct ipc64_perm * in\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 111 of 228\n\nnew style ipc permissions\r\nstruct ipc_perm * out\r\nold style ipc permissions\r\nDescription\r\nTurn the new style permissions object in into a compatibility object and store it into the out pointer.\r\nstruct kern_ipc_perm * ipc_obtain_object_idr (struct ipc_ids * ids, int id)¶\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nint id\r\nipc id to look for\r\nDescription\r\nLook for an id in the ipc ids idr and return associated ipc object.\r\nCall inside the RCU critical section. The ipc object is not locked on exit.\r\nstruct kern_ipc_perm * ipc_lock (struct ipc_ids * ids, int id)¶\r\nlock an ipc structure without rwsem held\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nint id\r\nipc id to look for\r\nDescription\r\nLook for an id in the ipc ids idr and lock the associated ipc object.\r\nThe ipc object is locked on successful exit.\r\nstruct kern_ipc_perm * ipc_obtain_object_check (struct ipc_ids * ids, int id)¶\r\nParameters\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nint id\r\nipc id to look for\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 112 of 228\n\nDescription\r\nSimilar to ipc_obtain_object_idr() but also checks the ipc object reference counter.\r\nCall inside the RCU critical section. 
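The user/group/other check that ipcperms() performs, as described above, can be sketched as follows. This is a userspace simplification with invented names; it omits the capability and security-module checks the kernel also performs, and for brevity the requested bits are given in the low three bit positions.

```c
/* Minimal owner/group/other permission check in the style of ipcperms(). */
struct perm { unsigned uid, gid; unsigned short mode; };

static int check_perms(const struct perm *p, unsigned uid, unsigned gid,
                       unsigned short requested)   /* 0-7: rwx bits */
{
    unsigned short granted = p->mode;

    if (uid == p->uid)
        granted >>= 6;              /* use the owner bits */
    else if (gid == p->gid)
        granted >>= 3;              /* use the group bits */
    granted &= 7;

    /* any requested bit not granted means access is denied */
    return (requested & ~granted) ? -1 : 0;   /* -EACCES analogue */
}
```

For a mode of 0640, the owner may read and write, the group may only read, and others get nothing.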
The ipc object is not locked on exit.\r\nint ipcget (struct ipc_namespace * ns, struct ipc_ids * ids, const struct ipc_ops * ops, struct ipc_params\r\n* params)¶\r\nCommon sys_*get() code\r\nParameters\r\nstruct ipc_namespace * ns\r\nnamespace\r\nstruct ipc_ids * ids\r\nipc identifier set\r\nconst struct ipc_ops * ops\r\noperations to be called on ipc object creation, permission checks and further checks\r\nstruct ipc_params * params\r\nthe parameters needed by the previous operations.\r\nDescription\r\nCommon routine called by sys_msgget() , sys_semget() and sys_shmget() .\r\nint ipc_update_perm (struct ipc64_perm * in, struct kern_ipc_perm * out)¶\r\nupdate the permissions of an ipc object\r\nParameters\r\nstruct ipc64_perm * in\r\nthe permission given as input.\r\nstruct kern_ipc_perm * out\r\nthe permission of the ipc to set.\r\nstruct kern_ipc_perm * ipcctl_pre_down_nolock (struct ipc_namespace * ns, struct ipc_ids * ids, int id, int cmd,\r\nstruct ipc64_perm * perm, int extra_perm)¶\r\nretrieve an ipc and check permissions for some IPC_XXX cmd\r\nParameters\r\nstruct ipc_namespace * ns\r\nipc namespace\r\nstruct ipc_ids * ids\r\nthe table of ids where to look for the ipc\r\nint id\r\nthe id of the ipc to retrieve\r\nint cmd\r\nthe cmd to check\r\nstruct ipc64_perm * perm\r\nthe permission to set\r\nint extra_perm\r\none extra permission parameter used by msq\r\nDescription\r\nThis function does some common audit and permissions check for some IPC_XXX cmd and is called from\r\nsemctl_down, shmctl_down and msgctl_down. 
It must be called without any lock held and:\r\n- retrieves the ipc with the given id in the given table.\r\n- performs some audit and permission check, depending on the given cmd.\r\n- returns a pointer to the ipc object or otherwise, the corresponding error.\r\nCall holding both the rwsem and the rcu read lock.\r\nint ipc_parse_version (int * cmd)¶\r\nipc call version\r\nParameters\r\nint * cmd\r\npointer to command\r\nDescription\r\nReturn IPC_64 for new style IPC and IPC_OLD for old style IPC. The cmd value is turned from an encoding\r\ncommand and version into just the command code.\r\nFIFO Buffer¶\r\nkfifo interface¶\r\nDECLARE_KFIFO_PTR (fifo, type)¶\r\nmacro to declare a fifo pointer object\r\nParameters\r\nfifo\r\nname of the declared fifo\r\ntype\r\ntype of the fifo elements\r\nDECLARE_KFIFO (fifo, type, size)¶\r\nmacro to declare a fifo object\r\nParameters\r\nfifo\r\nname of the declared fifo\r\ntype\r\ntype of the fifo elements\r\nsize\r\nthe number of elements in the fifo, this must be a power of 2\r\nINIT_KFIFO (fifo)¶\r\nInitialize a fifo declared by DECLARE_KFIFO\r\nParameters\r\nfifo\r\nname of the declared fifo datatype\r\nDEFINE_KFIFO (fifo, type, size)¶\r\nmacro to define and initialize a fifo\r\nParameters\r\nfifo\r\nname of the declared fifo datatype\r\ntype\r\ntype of the fifo elements\r\nsize\r\nthe number of elements in the fifo, this must be a power of 2\r\nNote\r\nthe macro can be used for global and local fifo data type variables.\r\nkfifo_initialized (fifo)¶\r\nCheck if the fifo is initialized\r\nParameters\r\nfifo\r\naddress of the fifo to check\r\nDescription\r\nReturn true if fifo is initialized, otherwise false. 
Assumes the fifo was 0 before.\r\nkfifo_esize (fifo)¶\r\nreturns the size of the element managed by the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_recsize (fifo)¶\r\nreturns the size of the record length field\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_size (fifo)¶\r\nreturns the size of the fifo in elements\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_reset (fifo)¶\r\nremoves the entire fifo content\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nNote\r\nusage of kfifo_reset() is dangerous. It should only be called when the fifo is exclusively locked or when it is\r\nensured that no other thread is accessing the fifo.\r\nkfifo_reset_out (fifo)¶\r\nskip fifo content\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nNote\r\nThe usage of kfifo_reset_out() is safe as long as it is only called from the reader thread and there is only one\r\nconcurrent reader. 
Otherwise it is dangerous and must be handled in the same way as kfifo_reset() .\r\nkfifo_len (fifo)¶\r\nreturns the number of used elements in the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_is_empty (fifo)¶\r\nreturns true if the fifo is empty\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_is_full (fifo)¶\r\nreturns true if the fifo is full\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_avail (fifo)¶\r\nreturns the number of unused elements in the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_skip (fifo)¶\r\nskip output data\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nkfifo_peek_len (fifo)¶\r\ngets the size of the next fifo record\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nDescription\r\nThis function returns the size of the next fifo record in number of bytes.\r\nkfifo_alloc (fifo, size, gfp_mask)¶\r\ndynamically allocates a new fifo buffer\r\nParameters\r\nfifo\r\npointer to the fifo\r\nsize\r\nthe number of elements in the fifo, this must be a power of 2\r\ngfp_mask\r\nget_free_pages mask, passed to kmalloc()\r\nDescription\r\nThis macro dynamically allocates a new fifo buffer.\r\nThe number of elements will be rounded up to a power of 2. The fifo will be released with kfifo_free() . 
Return 0\r\nif no error, otherwise an error code.\r\nkfifo_free (fifo)¶\r\nfrees the fifo\r\nParameters\r\nfifo\r\nthe fifo to be freed\r\nkfifo_init (fifo, buffer, size)¶\r\ninitialize a fifo using a preallocated buffer\r\nParameters\r\nfifo\r\nthe fifo to assign the buffer\r\nbuffer\r\nthe preallocated buffer to be used\r\nsize\r\nthe size of the internal buffer, this has to be a power of 2\r\nDescription\r\nThis macro initializes a fifo using a preallocated buffer.\r\nThe number of elements will be rounded up to a power of 2. Return 0 if no error, otherwise an error code.\r\nkfifo_put (fifo, val)¶\r\nput data into the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nval\r\nthe data to be added\r\nDescription\r\nThis macro copies the given value into the fifo. It returns 0 if the fifo was full. Otherwise it returns the number\r\nof processed elements.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_get (fifo, val)¶\r\nget data from the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nval\r\naddress where to store the data\r\nDescription\r\nThis macro reads the data from the fifo. It returns 0 if the fifo was empty. Otherwise it returns the number\r\nof processed elements.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_peek (fifo, val)¶\r\nget data from the fifo without removing\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nval\r\naddress where to store the data\r\nDescription\r\nThis reads the data from the fifo without removing it from the fifo. It returns 0 if the fifo was empty. 
Otherwise it\r\nreturns the number of processed elements.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_in (fifo, buf, n)¶\r\nput data into the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nbuf\r\nthe data to be added\r\nn\r\nnumber of elements to be added\r\nDescription\r\nThis macro copies the given buffer into the fifo and returns the number of copied elements.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_in_spinlocked (fifo, buf, n, lock)¶\r\nput data into the fifo using a spinlock for locking\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nbuf\r\nthe data to be added\r\nn\r\nnumber of elements to be added\r\nlock\r\npointer to the spinlock to use for locking\r\nDescription\r\nThis macro copies the given buffer into the fifo and returns the number of copied elements.\r\nkfifo_out (fifo, buf, n)¶\r\nget data from the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nbuf\r\npointer to the storage buffer\r\nn\r\nmax. number of elements to get\r\nDescription\r\nThis macro gets some data from the fifo and returns the number of elements copied.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_out_spinlocked (fifo, buf, n, lock)¶\r\nget data from the fifo using a spinlock for locking\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nbuf\r\npointer to the storage buffer\r\nn\r\nmax. 
number of elements to get\r\nlock\r\npointer to the spinlock to use for locking\r\nDescription\r\nThis macro gets the data from the fifo and returns the number of elements copied.\r\nkfifo_from_user (fifo, from, len, copied)¶\r\nputs some data from user space into the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nfrom\r\npointer to the data to be added\r\nlen\r\nthe length of the data to be added\r\ncopied\r\npointer to output variable to store the number of copied bytes\r\nDescription\r\nThis macro copies at most len bytes from the from into the fifo, depending on the available space, and returns\r\n-EFAULT/0.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_to_user (fifo, to, len, copied)¶\r\ncopies data from the fifo into user space\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nto\r\nwhere the data must be copied\r\nlen\r\nthe size of the destination buffer\r\ncopied\r\npointer to output variable to store the number of copied bytes\r\nDescription\r\nThis macro copies at most len bytes from the fifo into the to buffer and returns -EFAULT/0.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_dma_in_prepare (fifo, sgl, nents, len)¶\r\nsetup a scatterlist for DMA input\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nsgl\r\npointer to the scatterlist array\r\nnents\r\nnumber of entries in the scatterlist array\r\nlen\r\nnumber of elements to transfer\r\nDescription\r\nThis macro fills a scatterlist for DMA input. 
It returns the number of entries in the scatterlist array.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_dma_in_finish (fifo, len)¶\r\nfinish a DMA IN operation\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nlen\r\nnumber of bytes received\r\nDescription\r\nThis macro finishes a DMA IN operation. The in counter will be updated by the len parameter. No error checking\r\nwill be done.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_dma_out_prepare (fifo, sgl, nents, len)¶\r\nsetup a scatterlist for DMA output\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nsgl\r\npointer to the scatterlist array\r\nnents\r\nnumber of entries in the scatterlist array\r\nlen\r\nnumber of elements to transfer\r\nDescription\r\nThis macro fills a scatterlist for DMA output with at most len bytes to transfer. It returns the number of entries in\r\nthe scatterlist array. A zero means there is no space available and the scatterlist is not filled.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_dma_out_finish (fifo, len)¶\r\nfinish a DMA OUT operation\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nlen\r\nnumber of bytes transferred\r\nDescription\r\nThis macro finishes a DMA OUT operation. The out counter will be updated by the len parameter. No error\r\nchecking will be done.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nkfifo_out_peek (fifo, buf, n)¶\r\ngets some data from the fifo\r\nParameters\r\nfifo\r\naddress of the fifo to be used\r\nbuf\r\npointer to the storage buffer\r\nn\r\nmax. 
number of elements to get\r\nDescription\r\nThis macro gets the data from the fifo and returns the number of elements copied. The data is not removed from the\r\nfifo.\r\nNote that with only one concurrent reader and one concurrent writer, you don’t need extra locking to use these\r\nmacros.\r\nrelay interface support¶\r\nRelay interface support is designed to provide an efficient mechanism for tools and facilities to relay large\r\namounts of data from kernel space to user space.\r\nrelay interface¶\r\nint relay_buf_full (struct rchan_buf * buf)¶\r\nboolean, is the channel buffer full?\r\nParameters\r\nstruct rchan_buf * buf\r\nchannel buffer\r\nDescription\r\nReturns 1 if the buffer is full, 0 otherwise.\r\nvoid relay_reset (struct rchan * chan)¶\r\nreset the channel\r\nParameters\r\nstruct rchan * chan\r\nthe channel\r\nDescription\r\nThis has the effect of erasing all data from all channel buffers and restarting the channel in its initial\r\nstate. The buffers are not freed, so any mappings are still in effect.\r\nNOTE. 
Care should be taken that the channel isn’t actually being used by anything when this call is\r\nmade.\r\nstruct rchan * relay_open (const char * base_filename, struct dentry * parent, size_t subbuf_size,\r\nsize_t n_subbufs, struct rchan_callbacks * cb, void * private_data)¶\r\ncreate a new relay channel\r\nParameters\r\nconst char * base_filename\r\nbase name of files to create, NULL for buffering only\r\nstruct dentry * parent\r\ndentry of parent directory, NULL for root directory or buffer\r\nsize_t subbuf_size\r\nsize of sub-buffers\r\nsize_t n_subbufs\r\nnumber of sub-buffers\r\nstruct rchan_callbacks * cb\r\nclient callback functions\r\nvoid * private_data\r\nuser-defined data\r\nDescription\r\nReturns channel pointer if successful, NULL otherwise.\r\nCreates a channel buffer for each cpu using the sizes and attributes specified. The created channel\r\nbuffer files will be named base_filename0...base_filenameN-1. 
File permissions will be S_IRUSR .\r\nIf opening a buffer (parent = NULL) that you later wish to register in a filesystem, call\r\nrelay_late_setup_files() once the parent dentry is available.\r\nint relay_late_setup_files (struct rchan * chan, const char * base_filename, struct dentry * parent)¶\r\ntriggers file creation\r\nParameters\r\nstruct rchan * chan\r\nchannel to operate on\r\nconst char * base_filename\r\nbase name of files to create\r\nstruct dentry * parent\r\ndentry of parent directory, NULL for root directory\r\nDescription\r\nReturns 0 if successful, non-zero otherwise.\r\nUsed to set up files for a previously buffer-only channel created by relay_open() with a NULL parent\r\ndentry.\r\nFor example, this is useful for performing early tracing in the kernel, before VFS is up, and then exposing\r\nthe early results once the dentry is available.\r\nsize_t relay_switch_subbuf (struct rchan_buf * buf, size_t length)¶\r\nswitch to a new sub-buffer\r\nParameters\r\nstruct rchan_buf * buf\r\nchannel buffer\r\nsize_t length\r\nsize of current event\r\nDescription\r\nReturns either the length passed in or 0 if full.\r\nPerforms sub-buffer-switch tasks such as invoking callbacks, updating padding counts, waking up\r\nreaders, etc.\r\nvoid relay_subbufs_consumed (struct rchan * chan, unsigned int cpu, size_t subbufs_consumed)¶\r\nupdate the buffer’s sub-buffers-consumed count\r\nParameters\r\nstruct rchan * chan\r\nthe channel\r\nunsigned int cpu\r\nthe cpu associated with the channel buffer to update\r\nsize_t subbufs_consumed\r\nnumber of sub-buffers to add to current buf’s count\r\nDescription\r\nAdds to the channel buffer’s consumed sub-buffer count. subbufs_consumed should be the number of\r\nsub-buffers newly consumed, not the total consumed.\r\nNOTE. 
Kernel clients don’t need to call this function if the channel mode is ‘overwrite’.\r\nvoid relay_close (struct rchan * chan)¶\r\nclose the channel\r\nParameters\r\nstruct rchan * chan\r\nthe channel\r\nDescription\r\nCloses all channel buffers and frees the channel.\r\nvoid relay_flush (struct rchan * chan)¶\r\nflush the channel\r\nParameters\r\nstruct rchan * chan\r\nthe channel\r\nDescription\r\nFlushes all channel buffers, i.e. forces buffer switch.\r\nint relay_mmap_buf (struct rchan_buf * buf, struct vm_area_struct * vma)¶\r\nmmap channel buffer to process address space\r\nParameters\r\nstruct rchan_buf * buf\r\nrelay channel buffer\r\nstruct vm_area_struct * vma\r\nvm_area_struct describing memory to be mapped\r\nDescription\r\nReturns 0 if ok, negative on error.\r\nCaller should already have grabbed mmap_sem.\r\nvoid * relay_alloc_buf (struct rchan_buf * buf, size_t * size)¶\r\nallocate a channel buffer\r\nParameters\r\nstruct rchan_buf * buf\r\nthe buffer struct\r\nsize_t * size\r\ntotal size of the buffer\r\nDescription\r\nReturns a pointer to the resulting buffer, NULL if unsuccessful. 
The passed in size will get page\r\naligned, if it isn’t already.\r\nstruct rchan_buf * relay_create_buf (struct rchan * chan)¶\r\nallocate and initialize a channel buffer\r\nParameters\r\nstruct rchan * chan\r\nthe relay channel\r\nDescription\r\nReturns channel buffer if successful, NULL otherwise.\r\nvoid relay_destroy_channel (struct kref * kref)¶\r\nfree the channel struct\r\nParameters\r\nstruct kref * kref\r\ntarget kernel reference that contains the relay channel\r\nDescription\r\nShould only be called from kref_put() .\r\nvoid relay_destroy_buf (struct rchan_buf * buf)¶\r\ndestroy an rchan_buf struct and associated buffer\r\nParameters\r\nstruct rchan_buf * buf\r\nthe buffer struct\r\nvoid relay_remove_buf (struct kref * kref)¶\r\nremove a channel buffer\r\nParameters\r\nstruct kref * kref\r\ntarget kernel reference that contains the relay buffer\r\nDescription\r\nRemoves the file from the filesystem, which also frees the rchan_buf struct and the channel buffer.\r\nShould only be called from kref_put() .\r\nint relay_buf_empty (struct rchan_buf * buf)¶\r\nboolean, is the channel buffer empty?\r\nParameters\r\nstruct rchan_buf * buf\r\nchannel buffer\r\nDescription\r\nReturns 1 if the buffer is empty, 0 otherwise.\r\nvoid wakeup_readers (struct irq_work * work)¶\r\nwake up readers waiting on a channel\r\nParameters\r\nstruct irq_work * work\r\ncontains the channel buffer\r\nDescription\r\nThis is the function used to defer reader waking.\r\nvoid __relay_reset (struct rchan_buf * buf, unsigned int init)¶\r\nreset a channel buffer\r\nParameters\r\nstruct rchan_buf * buf\r\nthe channel buffer\r\nunsigned int init\r\n1 if this is a first-time initialization\r\nDescription\r\nSee relay_reset() for description of effect.\r\nvoid relay_close_buf (struct rchan_buf * buf)¶\r\nclose a channel 
buffer\r\nParameters\r\nstruct rchan_buf * buf\r\nchannel buffer\r\nDescription\r\nMarks the buffer finalized and restores the default callbacks. The channel buffer and channel buffer data\r\nstructure are then freed automatically when the last reference is given up.\r\nint relay_file_open (struct inode * inode, struct file * filp)¶\r\nopen file op for relay files\r\nParameters\r\nstruct inode * inode\r\nthe inode\r\nstruct file * filp\r\nthe file\r\nDescription\r\nIncrements the channel buffer refcount.\r\nint relay_file_mmap (struct file * filp, struct vm_area_struct * vma)¶\r\nmmap file op for relay files\r\nParameters\r\nstruct file * filp\r\nthe file\r\nstruct vm_area_struct * vma\r\nthe vma describing what to map\r\nDescription\r\nCalls upon relay_mmap_buf() to map the file into user space.\r\nunsigned int relay_file_poll (struct file * filp, poll_table * wait)¶\r\npoll file op for relay files\r\nParameters\r\nstruct file * filp\r\nthe file\r\npoll_table * wait\r\npoll table\r\nDescription\r\nPoll implementation.\r\nint relay_file_release (struct inode * inode, struct file * filp)¶\r\nrelease file op for relay files\r\nParameters\r\nstruct inode * inode\r\nthe inode\r\nstruct file * filp\r\nthe file\r\nDescription\r\nDecrements the channel refcount, as the filesystem is no longer using it.\r\nsize_t relay_file_read_subbuf_avail (size_t read_pos, struct rchan_buf * buf)¶\r\nreturn bytes available in sub-buffer\r\nParameters\r\nsize_t read_pos\r\nfile read position\r\nstruct rchan_buf * buf\r\nrelay channel buffer\r\nsize_t relay_file_read_start_pos (size_t read_pos, struct rchan_buf * buf)¶\r\nfind the first available byte to read\r\nParameters\r\nsize_t read_pos\r\nfile read position\r\nstruct rchan_buf * buf\r\nrelay channel buffer\r\nDescription\r\nIf the read_pos is in the middle of padding, 
return the position of the first actually available byte,\r\notherwise return the original value.\r\nsize_t relay_file_read_end_pos (struct rchan_buf * buf, size_t read_pos, size_t count)¶\r\nreturn the new read position\r\nParameters\r\nstruct rchan_buf * buf\r\nrelay channel buffer\r\nsize_t read_pos\r\nfile read position\r\nsize_t count\r\nnumber of bytes to be read\r\nModule Support¶\r\nModule Loading¶\r\nint __request_module (bool wait, const char * fmt, ...)¶\r\ntry to load a kernel module\r\nParameters\r\nbool wait\r\nwait (or not) for the operation to complete\r\nconst char * fmt\r\nprintf style format string for the name of the module\r\n...\r\narguments as specified in the format string\r\nDescription\r\nLoad a module using the user mode module loader. The function returns zero on success or a negative errno code\r\nor positive exit code from “modprobe” on failure. Note that a successful module load does not mean the module\r\ndid not then unload and exit on an error of its own. 
Callers must check that the service they requested is now\r\navailable, not blindly invoke it.\r\nIf module auto-loading support is disabled then this function becomes a no-operation.\r\nstruct subprocess_info * call_usermodehelper_setup (const char * path, char ** argv, char ** envp,\r\ngfp_t gfp_mask, int (*init)(struct subprocess_info *info, struct cred *new), void (*cleanup)(struct\r\nsubprocess_info *info), void * data)¶\r\nprepare to call a usermode helper\r\nParameters\r\nconst char * path\r\npath to usermode executable\r\nchar ** argv\r\narg vector for process\r\nchar ** envp\r\nenvironment for process\r\ngfp_t gfp_mask\r\ngfp mask for memory allocation\r\nint (*)(struct subprocess_info *info, struct cred *new) init\r\nan init function\r\nvoid (*)(struct subprocess_info *info) cleanup\r\na cleanup function\r\nvoid * data\r\narbitrary context sensitive data\r\nDescription\r\nReturns either NULL on allocation failure, or a subprocess_info structure. This should be passed to\r\ncall_usermodehelper_exec to exec the process and free the structure.\r\nThe init function is used to customize the helper process prior to exec. A non-zero return code causes the process\r\nto error out, exit, and return the failure to the calling process.\r\nThe cleanup function is called just before the subprocess_info is freed. This can be used for freeing the argv\r\nand envp. The function must be runnable in either a process context or the context in which\r\ncall_usermodehelper_exec is called.\r\nint call_usermodehelper_exec (struct subprocess_info * sub_info, int wait)¶\r\nstart a usermode application\r\nParameters\r\nstruct subprocess_info * sub_info\r\ninformation about the subprocess\r\nint wait\r\nwait for the application to finish and return status. When UMH_NO_WAIT, don’t wait at all, but you get no\r\nuseful error back when the program couldn’t be exec’ed. 
This makes it safe to call from interrupt context.\r\nDescription\r\nRuns a user-space application. The application is started asynchronously if wait is not set, and runs as a child of\r\nsystem workqueues (i.e. it runs with full root capabilities and optimized affinity).\r\nint call_usermodehelper (const char * path, char ** argv, char ** envp, int wait)¶\r\nprepare and start a usermode application\r\nParameters\r\nconst char * path\r\npath to usermode executable\r\nchar ** argv\r\narg vector for process\r\nchar ** envp\r\nenvironment for process\r\nint wait\r\nwait for the application to finish and return status. When UMH_NO_WAIT, don’t wait at all, but you get no\r\nuseful error back when the program couldn’t be exec’ed. This makes it safe to call from interrupt context.\r\nDescription\r\nThis function is equivalent to using call_usermodehelper_setup() and call_usermodehelper_exec() .\r\nInter Module support¶\r\nRefer to the file kernel/module.c for more information.\r\nHardware Interfaces¶\r\nInterrupt Handling¶\r\nbool synchronize_hardirq (unsigned int irq)¶\r\nwait for pending hard IRQ handlers (on other CPUs)\r\nParameters\r\nunsigned int irq\r\ninterrupt number to wait for\r\nDescription\r\nThis function waits for any pending hard IRQ handlers for this interrupt to complete before returning. If\r\nyou use this function while holding a resource the IRQ handler may need, you will deadlock. 
It does not\r\ntake associated threaded handlers into account.\r\nDo not use this for shutdown scenarios where you must be sure that all parts (hardirq and threaded\r\nhandler) have completed.\r\nReturn\r\nfalse if a threaded handler is active.\r\nThis function may be called - with care - from IRQ context.\r\nvoid synchronize_irq (unsigned int irq)¶\r\nwait for pending IRQ handlers (on other CPUs)\r\nParameters\r\nunsigned int irq\r\ninterrupt number to wait for\r\nDescription\r\nThis function waits for any pending IRQ handlers for this interrupt to complete before returning. If you\r\nuse this function while holding a resource the IRQ handler may need, you will deadlock.\r\nThis function may be called - with care - from IRQ context.\r\nint irq_set_affinity_notifier (unsigned int irq, struct irq_affinity_notify * notify)¶\r\ncontrol notification of IRQ affinity changes\r\nParameters\r\nunsigned int irq\r\nInterrupt for which to enable/disable notification\r\nstruct irq_affinity_notify * notify\r\nContext for notification, or NULL to disable notification. Function pointers must be initialised; the other\r\nfields will be initialised by this function.\r\nDescription\r\nMust be called in process context. Notification may only be enabled after the IRQ is allocated and must\r\nbe disabled before the IRQ is freed using free_irq() .\r\nint irq_set_vcpu_affinity (unsigned int irq, void * vcpu_info)¶\r\nSet vcpu affinity for the interrupt\r\nParameters\r\nunsigned int irq\r\ninterrupt number to set affinity\r\nvoid * vcpu_info\r\nvCPU specific data\r\nDescription\r\nThis function uses the vCPU specific data to set the vCPU affinity for an irq. The vCPU specific data is\r\npassed from outside, such as KVM. 
One example code path is as below: KVM -> IOMMU ->\r\nirq_set_vcpu_affinity() .\r\nvoid disable_irq_nosync (unsigned int irq)¶\r\ndisable an irq without waiting\r\nParameters\r\nunsigned int irq\r\nInterrupt to disable\r\nDescription\r\nDisable the selected interrupt line. Disables and Enables are nested. Unlike disable_irq() , this\r\nfunction does not ensure existing instances of the IRQ handler have completed before returning.\r\nThis function may be called from IRQ context.\r\nvoid disable_irq (unsigned int irq)¶\r\ndisable an irq and wait for completion\r\nParameters\r\nunsigned int irq\r\nInterrupt to disable\r\nDescription\r\nDisable the selected interrupt line. Enables and Disables are nested. This function waits for any pending\r\nIRQ handlers for this interrupt to complete before returning. If you use this function while holding a\r\nresource the IRQ handler may need, you will deadlock.\r\nThis function may be called - with care - from IRQ context.\r\nbool disable_hardirq (unsigned int irq)¶\r\ndisables an irq and waits for hardirq completion\r\nParameters\r\nunsigned int irq\r\nInterrupt to disable\r\nDescription\r\nDisable the selected interrupt line. Enables and Disables are nested. This function waits for any pending\r\nhard IRQ handlers for this interrupt to complete before returning. 
If you use this function while holding\r\na resource the hard IRQ handler may need, you will deadlock.\r\nWhen used to optimistically disable an interrupt from atomic context the return value must be checked.\r\nReturn\r\nfalse if a threaded handler is active.\r\nThis function may be called - with care - from IRQ context.\r\nvoid enable_irq (unsigned int irq)¶\r\nenable handling of an irq\r\nParameters\r\nunsigned int irq\r\nInterrupt to enable\r\nDescription\r\nUndoes the effect of one call to disable_irq() . If this matches the last disable, processing of\r\ninterrupts on this IRQ line is re-enabled.\r\nThis function may be called from IRQ context only when desc->irq_data.chip->bus_lock and\r\ndesc->chip->bus_sync_unlock are NULL !\r\nint irq_set_irq_wake (unsigned int irq, unsigned int on)¶\r\ncontrol irq power management wakeup\r\nParameters\r\nunsigned int irq\r\ninterrupt to control\r\nunsigned int on\r\nenable/disable power management wakeup\r\nDescription\r\nEnable/disable power management wakeup mode, which is disabled by default. 
Enables and disables\r\nmust match, just as they match for non-wakeup mode support.\r\nWakeup mode lets this IRQ wake the system from sleep states like “suspend to RAM”.\r\nvoid irq_wake_thread (unsigned int irq, void * dev_id)¶\r\nwake the irq thread for the action identified by dev_id\r\nParameters\r\nunsigned int irq\r\nInterrupt line\r\nvoid * dev_id\r\nDevice identity for which the thread should be woken\r\nint setup_irq (unsigned int irq, struct irqaction * act)¶\r\nsetup an interrupt\r\nParameters\r\nunsigned int irq\r\nInterrupt line to setup\r\nstruct irqaction * act\r\nirqaction for the interrupt\r\nDescription\r\nUsed to statically set up interrupts in the early boot process.\r\nvoid remove_irq (unsigned int irq, struct irqaction * act)¶\r\nfree an interrupt\r\nParameters\r\nunsigned int irq\r\nInterrupt line to free\r\nstruct irqaction * act\r\nirqaction for the interrupt\r\nDescription\r\nUsed to remove interrupts statically set up by the early boot process.\r\nconst void * free_irq (unsigned int irq, void * dev_id)¶\r\nfree an interrupt allocated with request_irq\r\nParameters\r\nunsigned int irq\r\nInterrupt line to free\r\nvoid * dev_id\r\nDevice identity to free\r\nDescription\r\nRemove an interrupt handler. The handler is removed and if the interrupt line is no longer in use by any\r\ndriver it is disabled. On a shared IRQ the caller must ensure the interrupt is disabled on the card it\r\ndrives before calling this function. 
The function does not return until any executing interrupts for this\r\nIRQ have completed.\r\nThis function must not be called from interrupt context.\r\nReturns the devname argument passed to request_irq.\r\nint request_threaded_irq (unsigned int irq, irq_handler_t handler, irq_handler_t thread_fn, unsigned\r\nlong irqflags, const char * devname, void * dev_id)¶\r\nallocate an interrupt line\r\nParameters\r\nunsigned int irq\r\nInterrupt line to allocate\r\nirq_handler_t handler\r\nFunction to be called when the IRQ occurs. Primary handler for threaded interrupts. If NULL and thread_fn\r\n!= NULL, the default primary handler is installed.\r\nirq_handler_t thread_fn\r\nFunction called from the irq handler thread. If NULL, no irq thread is created.\r\nunsigned long irqflags\r\nInterrupt type flags\r\nconst char * devname\r\nAn ASCII name for the claiming device\r\nvoid * dev_id\r\nA cookie passed back to the handler function\r\nDescription\r\nThis call allocates interrupt resources and enables the interrupt line and IRQ handling. From the point\r\nthis call is made your handler function may be invoked. Since your handler function must clear any\r\ninterrupt the board raises, you must take care both to initialise your hardware and to set up the interrupt\r\nhandler in the right order.\r\nIf you want to set up a threaded irq handler for your device then you need to supply handler and\r\nthread_fn. handler is still called in hard interrupt context and has to check whether the interrupt\r\noriginates from the device. If yes, it needs to disable the interrupt on the device and return\r\nIRQ_WAKE_THREAD, which will wake up the handler thread and run thread_fn. This split handler\r\ndesign is necessary to support shared interrupts.\r\nDev_id must be globally unique. 
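The split primary/threaded handler design described above might look like the following sketch; every foo_* identifier is hypothetical:

```c
/* Hedged sketch of a handler pair for request_threaded_irq().
 * All foo_* helpers are hypothetical. */
static irqreturn_t foo_quick_check(int irq, void *dev_id)
{
	struct foo_device *foo = dev_id;	/* the dev_id cookie */

	if (!foo_irq_pending(foo))	/* shared line: not our interrupt */
		return IRQ_NONE;
	foo_mask_device_irq(foo);	/* quiesce the device... */
	return IRQ_WAKE_THREAD;		/* ...and defer to the thread */
}

static irqreturn_t foo_thread(int irq, void *dev_id)
{
	struct foo_device *foo = dev_id;

	foo_process_events(foo);	/* may sleep here */
	foo_unmask_device_irq(foo);
	return IRQ_HANDLED;
}

/* At probe time: */
ret = request_threaded_irq(foo->irq, foo_quick_check, foo_thread,
			   IRQF_SHARED, "foo", foo);
```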
Normally the address of the device data structure is used as the cookie.\r\nSince the handler receives this value it makes sense to use it.\r\nIf your interrupt is shared you must pass a non-NULL dev_id as this is required when freeing the\r\ninterrupt.\r\nFlags:\r\nIRQF_SHARED - Interrupt is shared\r\nIRQF_TRIGGER_* - Specify active edge(s) or level\r\nint request_any_context_irq (unsigned int irq, irq_handler_t handler, unsigned long flags, const char * name,\r\nvoid * dev_id)¶\r\nallocate an interrupt line\r\nParameters\r\nunsigned int irq\r\nInterrupt line to allocate\r\nirq_handler_t handler\r\nFunction to be called when the IRQ occurs. Threaded handler for threaded interrupts.\r\nunsigned long flags\r\nInterrupt type flags\r\nconst char * name\r\nAn ASCII name for the claiming device\r\nvoid * dev_id\r\nA cookie passed back to the handler function\r\nDescription\r\nThis call allocates interrupt resources and enables the interrupt line and IRQ handling. It selects either a\r\nhardirq or threaded handling method depending on the context.\r\nOn failure, it returns a negative value. On success, it returns either IRQC_IS_HARDIRQ or\r\nIRQC_IS_NESTED.\r\nbool irq_percpu_is_enabled (unsigned int irq)¶\r\nCheck whether the per-CPU irq is enabled\r\nParameters\r\nunsigned int irq\r\nLinux irq number to check for\r\nDescription\r\nMust be called from a non-migratable context. Returns the enable state of a per-CPU interrupt on the current CPU.\r\nvoid free_percpu_irq (unsigned int irq, void __percpu * dev_id)¶\r\nfree an interrupt allocated with request_percpu_irq\r\nParameters\r\nunsigned int irq\r\nInterrupt line to free\r\nvoid __percpu * dev_id\r\nDevice identity to free\r\nDescription\r\nRemove a percpu interrupt handler. The handler is removed, but the interrupt line is not disabled; disabling it\r\nmust be done on each CPU before calling this function. 
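For the per-CPU variant, the disable-on-each-CPU-then-free sequence could be sketched as below; disable_percpu_irq() and on_each_cpu() are real kernel interfaces, while the foo_* names are hypothetical:

```c
/* Hedged sketch: tear down a per-CPU interrupt.  free_percpu_irq()
 * does not disable the line, so each CPU must disable it first. */
static void foo_disable_on_cpu(void *info)
{
	disable_percpu_irq(foo_irq);	/* runs on the local CPU */
}

static void foo_teardown(void)
{
	on_each_cpu(foo_disable_on_cpu, NULL, 1);	/* wait for all CPUs */
	free_percpu_irq(foo_irq, &foo_percpu_dev);	/* __percpu cookie */
}
```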
The function does not return until any executing\r\ninterrupts for this IRQ have completed.\r\nThis function must not be called from interrupt context.\r\nint request_percpu_irq (unsigned int irq, irq_handler_t handler, const char * devname, void __percpu\r\n* dev_id)¶\r\nallocate a percpu interrupt line\r\nParameters\r\nunsigned int irq\r\nInterrupt line to allocate\r\nirq_handler_t handler\r\nFunction to be called when the IRQ occurs.\r\nconst char * devname\r\nAn ASCII name for the claiming device\r\nvoid __percpu * dev_id\r\nA percpu cookie passed back to the handler function\r\nDescription\r\nThis call allocates interrupt resources and enables the interrupt on the local CPU. If the interrupt is\r\nsupposed to be enabled on other CPUs, it has to be done on each CPU using enable_percpu_irq() .\r\nDev_id must be globally unique. It is a per-cpu variable, and the handler gets called with the interrupted\r\nCPU’s instance of that variable.\r\nint irq_get_irqchip_state (unsigned int irq, enum irqchip_irq_state which, bool * state)¶\r\nreturns the irqchip state of an interrupt.\r\nParameters\r\nunsigned int irq\r\nInterrupt line that is forwarded to a VM\r\nenum irqchip_irq_state which\r\nOne of IRQCHIP_STATE_* the caller wants to know about\r\nbool * state\r\na pointer to a boolean where the state is to be stored\r\nDescription\r\nThis call snapshots the internal irqchip state of an interrupt, returning into state the bit corresponding to\r\nstate which.\r\nThis function should be called with preemption disabled if the interrupt controller has per-cpu registers.\r\nint irq_set_irqchip_state (unsigned int irq, enum irqchip_irq_state which, bool val)¶\r\nset the state of a forwarded interrupt.\r\nParameters\r\nunsigned int irq\r\nInterrupt line that is forwarded to a VM\r\nenum irqchip_irq_state 
which\r\nState to be restored (one of IRQCHIP_STATE_*)\r\nbool val\r\nValue corresponding to which\r\nDescription\r\nThis call sets the internal irqchip state of an interrupt, depending on the value of which.\r\nThis function should be called with preemption disabled if the interrupt controller has per-cpu registers.\r\nDMA Channels¶\r\nint request_dma (unsigned int dmanr, const char * device_id)¶\r\nrequest and reserve a system DMA channel\r\nParameters\r\nunsigned int dmanr\r\nDMA channel number\r\nconst char * device_id\r\nreserving device ID string, used in /proc/dma\r\nvoid free_dma (unsigned int dmanr)¶\r\nfree a reserved system DMA channel\r\nParameters\r\nunsigned int dmanr\r\nDMA channel number\r\nResources Management¶\r\nstruct resource * request_resource_conflict (struct resource * root, struct resource * new)¶\r\nrequest and reserve an I/O or memory resource\r\nParameters\r\nstruct resource * root\r\nroot resource descriptor\r\nstruct resource * new\r\nresource descriptor desired by caller\r\nDescription\r\nReturns 0 for success, conflict resource on error.\r\nint reallocate_resource (struct resource * root, struct resource * old, resource_size_t newsize, struct\r\nresource_constraint * constraint)¶\r\nallocate a slot in the resource tree given range & alignment. 
The resource will be relocated if the new size\r\ncannot be reallocated in the current location.\r\nParameters\r\nstruct resource * root\r\nroot resource descriptor\r\nstruct resource * old\r\nresource descriptor desired by caller\r\nresource_size_t newsize\r\nnew size of the resource descriptor\r\nstruct resource_constraint * constraint\r\nthe size and alignment constraints to be met.\r\nstruct resource * lookup_resource (struct resource * root, resource_size_t start)¶\r\nfind an existing resource by a resource start address\r\nParameters\r\nstruct resource * root\r\nroot resource descriptor\r\nresource_size_t start\r\nresource start address\r\nDescription\r\nReturns a pointer to the resource if found, NULL otherwise\r\nstruct resource * insert_resource_conflict (struct resource * parent, struct resource * new)¶\r\nInserts resource in the resource tree\r\nParameters\r\nstruct resource * parent\r\nparent of the new resource\r\nstruct resource * new\r\nnew resource to insert\r\nDescription\r\nReturns 0 on success, conflict resource if the resource can’t be inserted.\r\nThis function is equivalent to request_resource_conflict when no conflict happens. 
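A sketch of checking the conflict pointer returned by request_resource_conflict(); the foo name and addresses are made up:

```c
/* Hedged sketch: reserve an MMIO window under iomem_resource and
 * report the conflicting resource by name on failure. */
static struct resource foo_res = {
	.name  = "foo-regs",
	.start = 0xfed40000,		/* hypothetical address */
	.end   = 0xfed40fff,
	.flags = IORESOURCE_MEM,
};

struct resource *conflict = request_resource_conflict(&iomem_resource, &foo_res);
if (conflict)
	pr_err("foo: region busy, conflicts with %s\n", conflict->name);
```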
If a conflict happens, and the\r\nconflicting resources entirely fit within the range of the new resource, then the new resource is inserted and the\r\nconflicting resources become children of the new resource.\r\nThis function is intended for producers of resources, such as FW modules and bus drivers.\r\nvoid insert_resource_expand_to_fit (struct resource * root, struct resource * new)¶\r\nInsert a resource into the resource tree\r\nParameters\r\nstruct resource * root\r\nroot resource descriptor\r\nstruct resource * new\r\nnew resource to insert\r\nDescription\r\nInsert a resource into the resource tree, possibly expanding it in order to make it encompass any conflicting\r\nresources.\r\nresource_size_t resource_alignment (struct resource * res)¶\r\ncalculate resource’s alignment\r\nParameters\r\nstruct resource * res\r\nresource pointer\r\nDescription\r\nReturns alignment on success, 0 (invalid alignment) on failure.\r\nint release_mem_region_adjustable (struct resource * parent, resource_size_t start, resource_size_t size)¶\r\nrelease a previously reserved memory region\r\nParameters\r\nstruct resource * parent\r\nparent resource descriptor\r\nresource_size_t start\r\nresource start address\r\nresource_size_t size\r\nresource region size\r\nDescription\r\nThis interface is intended for memory hot-delete. The requested region is released from a currently busy memory\r\nresource. The requested region must either match exactly or fit into a single busy resource entry. In the latter case,\r\nthe remaining resource is adjusted accordingly. 
Existing children of the busy memory resource must be immutable\r\nin the request.\r\nNote\r\nAdditional release conditions, such as overlapping region, can be supported after they are confirmed as\r\nvalid cases.\r\nWhen a busy memory resource gets split into two entries, the code assumes that all children remain in the\r\nlower address entry for simplicity. Enhance this logic when necessary.\r\nint request_resource (struct resource * root, struct resource * new)¶\r\nrequest and reserve an I/O or memory resource\r\nParameters\r\nstruct resource * root\r\nroot resource descriptor\r\nstruct resource * new\r\nresource descriptor desired by caller\r\nDescription\r\nReturns 0 for success, negative error code on error.\r\nint release_resource (struct resource * old)¶\r\nrelease a previously reserved resource\r\nParameters\r\nstruct resource * old\r\nresource pointer\r\nint region_intersects (resource_size_t start, size_t size, unsigned long flags, unsigned long desc)¶\r\ndetermine intersection of region with known resources\r\nParameters\r\nresource_size_t start\r\nregion start address\r\nsize_t size\r\nsize of region\r\nunsigned long flags\r\nflags of resource (in iomem_resource)\r\nunsigned long desc\r\ndescriptor of resource (in iomem_resource) or IORES_DESC_NONE\r\nDescription\r\nCheck if the specified region partially overlaps or fully eclipses a resource identified by flags and desc (optional\r\nwith IORES_DESC_NONE). Return REGION_DISJOINT if the region does not overlap flags/desc, return\r\nREGION_MIXED if the region overlaps flags/desc and another resource, and return REGION_INTERSECTS if\r\nthe region overlaps flags/desc and no other defined resource. 
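A sketch of acting on the three region_intersects() return values; the wrapper function is hypothetical, while the flag and constant names come from the kernel headers:

```c
/* Hedged sketch: refuse to remap a range that touches System RAM. */
int foo_check_range(resource_size_t start, size_t size)
{
	int rc = region_intersects(start, size, IORESOURCE_SYSTEM_RAM,
				   IORES_DESC_NONE);

	if (rc != REGION_DISJOINT)	/* INTERSECTS or MIXED: RAM involved */
		return -EBUSY;
	return 0;			/* safe to remap */
}
```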
Note that REGION_INTERSECTS is also returned\r\nin the case when the specified region overlaps RAM and undefined memory holes.\r\nregion_intersects() is used by memory remapping functions to ensure the user is not remapping RAM and is a\r\nvast speed up over walking through the resource table page by page.\r\nint allocate_resource (struct resource * root, struct resource * new, resource_size_t size, resource_size_t min,\r\nresource_size_t max, resource_size_t align, resource_size_t (*alignf) (void *, const struct resource *,\r\nresource_size_t, resource_size_t), void * alignf_data)¶\r\nallocate empty slot in the resource tree given range & alignment. The resource will be reallocated with a\r\nnew size if it was already allocated\r\nParameters\r\nstruct resource * root\r\nroot resource descriptor\r\nstruct resource * new\r\nresource descriptor desired by caller\r\nresource_size_t size\r\nrequested resource region size\r\nresource_size_t min\r\nminimum boundary to allocate\r\nresource_size_t max\r\nmaximum boundary to allocate\r\nresource_size_t align\r\nalignment requested, in bytes\r\nresource_size_t (*)(void *, const struct resource *, resource_size_t, resource_size_t) alignf\r\nalignment function, optional, called if not NULL\r\nvoid * alignf_data\r\narbitrary data to pass to the alignf function\r\nint insert_resource (struct resource * parent, struct resource * new)¶\r\nInserts a resource in the resource tree\r\nParameters\r\nstruct resource * parent\r\nparent of the new resource\r\nstruct resource * new\r\nnew resource to insert\r\nDescription\r\nReturns 0 on success, -EBUSY if the resource can’t be inserted.\r\nThis function is intended for producers of resources, such as FW modules and bus drivers.\r\nint remove_resource (struct resource * old)¶\r\nRemove a resource in the resource tree\r\nParameters\r\nstruct resource * old\r\nresource to remove\r\nDescription\r\nReturns 
0 on success, -EINVAL if the resource is not valid.\r\nThis function removes a resource previously inserted by insert_resource() or insert_resource_conflict() ,\r\nand moves the children (if any) up to where they were before. insert_resource() and\r\ninsert_resource_conflict() insert a new resource, and move any conflicting resources down to the children of\r\nthe new resource.\r\ninsert_resource() , insert_resource_conflict() and remove_resource() are intended for producers of\r\nresources, such as FW modules and bus drivers.\r\nint adjust_resource (struct resource * res, resource_size_t start, resource_size_t size)¶\r\nmodify a resource’s start and size\r\nParameters\r\nstruct resource * res\r\nresource to modify\r\nresource_size_t start\r\nnew start value\r\nresource_size_t size\r\nnew size\r\nDescription\r\nGiven an existing resource, change its start and size to match the arguments. Returns 0 on success, -EBUSY if it\r\ncan’t fit. Existing children of the resource are assumed to be immutable.\r\nstruct resource * __request_region (struct resource * parent, resource_size_t start, resource_size_t n, const char\r\n* name, int flags)¶\r\ncreate a new busy resource region\r\nParameters\r\nstruct resource * parent\r\nparent resource descriptor\r\nresource_size_t start\r\nresource start address\r\nresource_size_t n\r\nresource region size\r\nconst char * name\r\nreserving caller’s ID string\r\nint flags\r\nIO resource flags\r\nvoid __release_region (struct resource * parent, resource_size_t start, resource_size_t n)¶\r\nrelease a previously reserved resource region\r\nParameters\r\nstruct resource * parent\r\nparent resource descriptor\r\nresource_size_t start\r\nresource start address\r\nresource_size_t n\r\nresource region size\r\nDescription\r\nThe described resource region must match a currently busy region.\r\nint devm_request_resource (struct device * dev, struct resource * 
root, struct resource * new)¶\r\nrequest and reserve an I/O or memory resource\r\nParameters\r\nstruct device * dev\r\ndevice for which to request the resource\r\nstruct resource * root\r\nroot of the resource tree from which to request the resource\r\nstruct resource * new\r\ndescriptor of the resource to request\r\nDescription\r\nThis is a device-managed version of request_resource() . There is usually no need to release resources\r\nrequested by this function explicitly since that will be taken care of when the device is unbound from its driver. If\r\nfor some reason the resource needs to be released explicitly, because of ordering issues for example, drivers must\r\ncall devm_release_resource() rather than the regular release_resource() .\r\nWhen a conflict is detected between any existing resources and the newly requested resource, an error message\r\nwill be printed.\r\nReturns 0 on success or a negative error code on failure.\r\nvoid devm_release_resource (struct device * dev, struct resource * new)¶\r\nrelease a previously requested resource\r\nParameters\r\nstruct device * dev\r\ndevice for which to release the resource\r\nstruct resource * new\r\ndescriptor of the resource to release\r\nDescription\r\nReleases a resource previously requested using devm_request_resource() .\r\nMTRR Handling¶\r\nint arch_phys_wc_add (unsigned long base, unsigned long size)¶\r\nadd a WC MTRR and handle errors if PAT is unavailable\r\nParameters\r\nunsigned long base\r\nPhysical base address\r\nunsigned long size\r\nSize of region\r\nDescription\r\nIf PAT is available, this does nothing. 
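A hedged sketch of the usual write-combining mapping pattern built on this helper; the fb structure is hypothetical, and ioremap_wc() / arch_phys_wc_del() are assumed as the matching kernel helpers:

```c
/* Hedged sketch: map a framebuffer write-combined and add a WC MTRR
 * on non-PAT systems.  Store the opaque cookie; do not interpret it. */
fb->wc_cookie = arch_phys_wc_add(fb->phys_base, fb->size);
fb->screen_base = ioremap_wc(fb->phys_base, fb->size);

/* ... on teardown: */
iounmap(fb->screen_base);
arch_phys_wc_del(fb->wc_cookie);
```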
If PAT is unavailable, it attempts to add a WC MTRR covering size bytes\r\nstarting at base and logs an error if this fails.\r\nThe caller should provide a power of two size on an equivalent power of two boundary.\r\nDrivers must store the return value to pass to mtrr_del_wc_if_needed, but drivers should not try to interpret that\r\nreturn value.\r\nSecurity Framework¶\r\nint security_init (void)¶\r\ninitializes the security framework\r\nParameters\r\nvoid\r\nno arguments\r\nDescription\r\nThis should be called early in the kernel initialization sequence.\r\nint security_module_enable (const char * module)¶\r\nLoad given security module on boot?\r\nParameters\r\nconst char * module\r\nthe name of the module\r\nDescription\r\nEach LSM must pass this method before registering its own operations to avoid security registration races. This\r\nmethod may also be used to check if your LSM is currently loaded during kernel initialization.\r\nReturn\r\ntrue if:\r\nThe passed LSM is the one chosen by the user at boot time,\r\nor the passed LSM is configured as the default and the user did not choose an alternate LSM at boot time.\r\nOtherwise, return false.\r\nvoid security_add_hooks (struct security_hook_list * hooks, int count, char * lsm)¶\r\nAdd a module’s hooks to the hook lists.\r\nParameters\r\nstruct security_hook_list * hooks\r\nthe hooks to add\r\nint count\r\nthe number of hooks to add\r\nchar * lsm\r\nthe name of the security module\r\nDescription\r\nEach LSM has to register its hooks with the infrastructure.\r\nstruct dentry * securityfs_create_file (const char * name, umode_t mode, struct dentry * parent, void * data,\r\nconst struct file_operations * fops)¶\r\ncreate a file in the securityfs filesystem\r\nParameters\r\nconst char * name\r\na pointer to a string containing the name of the file to 
create.\r\numode_t mode\r\nthe permission that the file should have\r\nstruct dentry * parent\r\na pointer to the parent dentry for this file. This should be a directory dentry if set. If this parameter is\r\nNULL , then the file will be created in the root of the securityfs filesystem.\r\nvoid * data\r\na pointer to something that the caller will want to get to later on. The inode.i_private pointer will point to\r\nthis value on the open() call.\r\nconst struct file_operations * fops\r\na pointer to a struct file_operations that should be used for this file.\r\nDescription\r\nThis is the basic “create a file” function for securityfs. It allows for a wide range of flexibility in creating a file, or\r\na directory (if you want to create a directory, the securityfs_create_dir() function is recommended to be used\r\ninstead).\r\nThis function returns a pointer to a dentry if it succeeds. This pointer must be passed to the\r\nsecurityfs_remove() function when the file is to be removed (no automatic cleanup happens if your module is\r\nunloaded, you are responsible here). If an error occurs, the function will return the error value (via ERR_PTR).\r\nIf securityfs is not enabled in the kernel, the value -ENODEV is returned.\r\nstruct dentry * securityfs_create_dir (const char * name, struct dentry * parent)¶\r\ncreate a directory in the securityfs filesystem\r\nParameters\r\nconst char * name\r\na pointer to a string containing the name of the directory to create.\r\nstruct dentry * parent\r\na pointer to the parent dentry for this file. This should be a directory dentry if set. If this parameter is\r\nNULL , then the directory will be created in the root of the securityfs filesystem.\r\nDescription\r\nThis function creates a directory in securityfs with the given name.\r\nThis function returns a pointer to a dentry if it succeeds. 
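Creation and cleanup of securityfs entries might be sketched together like this; the names and the foo_policy_fops operations are hypothetical:

```c
/* Hedged sketch: one directory with one read-only file in securityfs.
 * Returned dentries must be kept for securityfs_remove(). */
static struct dentry *foo_dir, *foo_file;

static int __init foo_lsm_fs_init(void)
{
	foo_dir = securityfs_create_dir("foo", NULL);	/* in securityfs root */
	if (IS_ERR(foo_dir))
		return PTR_ERR(foo_dir);

	foo_file = securityfs_create_file("policy", 0444, foo_dir,
					  NULL, &foo_policy_fops);
	if (IS_ERR(foo_file)) {
		securityfs_remove(foo_dir);	/* no automatic cleanup */
		return PTR_ERR(foo_file);
	}
	return 0;
}
```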
This pointer must be passed to the\r\nsecurityfs_remove() function when the file is to be removed (no automatic cleanup happens if your module is\r\nunloaded, you are responsible here). If an error occurs, the function will return the error value (via ERR_PTR).\r\nIf securityfs is not enabled in the kernel, the value -ENODEV is returned.\r\nvoid securityfs_remove (struct dentry * dentry)¶\r\nremoves a file or directory from the securityfs filesystem\r\nParameters\r\nstruct dentry * dentry\r\na pointer to the dentry of the file or directory to be removed.\r\nDescription\r\nThis function removes a file or directory in securityfs that was previously created with a call to another securityfs\r\nfunction (like securityfs_create_file() or variants thereof).\r\nThis function is required to be called in order for the file to be removed. No automatic cleanup of files will happen\r\nwhen a module is removed; you are responsible here.\r\nAudit Interfaces¶\r\nstruct audit_buffer * audit_log_start (struct audit_context * ctx, gfp_t gfp_mask, int type)¶\r\nobtain an audit buffer\r\nParameters\r\nstruct audit_context * ctx\r\naudit_context (may be NULL)\r\ngfp_t gfp_mask\r\ntype of allocation\r\nint type\r\naudit message type\r\nDescription\r\nReturns audit_buffer pointer on success or NULL on error.\r\nObtain an audit buffer. This routine does locking to obtain the audit buffer, but then no locking is required for calls\r\nto audit_log_*format. If the task (ctx) is a task that is currently in a syscall, then the syscall is marked as auditable\r\nand an audit record will be written at syscall exit. 
If there is no associated task, then task context (ctx) should be\r\nNULL.\r\nvoid audit_log_format (struct audit_buffer * ab, const char * fmt, ...)¶\r\nformat a message into the audit buffer.\r\nParameters\r\nstruct audit_buffer * ab\r\naudit_buffer\r\nconst char * fmt\r\nformat string\r\n...\r\noptional parameters matching fmt string\r\nDescription\r\nAll the work is done in audit_log_vformat.\r\nvoid audit_log_end (struct audit_buffer * ab)¶\r\nend one audit record\r\nParameters\r\nstruct audit_buffer * ab\r\nthe audit_buffer\r\nDescription\r\nWe can not do a netlink send inside an irq context because it blocks (last arg, flags, is not set to\r\nMSG_DONTWAIT), so the audit buffer is placed on a queue and a tasklet is scheduled to remove them from the\r\nqueue outside the irq context. May be called in any context.\r\nvoid audit_log (struct audit_context * ctx, gfp_t gfp_mask, int type, const char * fmt, ...)¶\r\nLog an audit record\r\nParameters\r\nstruct audit_context * ctx\r\naudit context\r\ngfp_t gfp_mask\r\ntype of allocation\r\nint type\r\nhttps://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html\r\nPage 154 of 228\n\naudit message type\r\nconst char * fmt\r\nformat string to use\r\n...\r\nvariable parameters matching the format string\r\nDescription\r\nThis is a convenience function that calls audit_log_start, audit_log_vformat, and audit_log_end. It may be called\r\nin any context.\r\nvoid audit_log_secctx (struct audit_buffer * ab, u32 secid)¶\r\nConverts and logs SELinux context\r\nParameters\r\nstruct audit_buffer * ab\r\naudit_buffer\r\nu32 secid\r\nsecurity number\r\nDescription\r\nThis is a helper function that calls security_secid_to_secctx to convert secid to secctx and then adds the\r\n(converted) SELinux context to the audit log by calling audit_log_format, thus also preventing leak of internal\r\nsecid to userspace. 
If secid cannot be converted, audit_panic is called.\r\nint audit_alloc (struct task_struct * tsk)¶\r\nallocate an audit context block for a task\r\nParameters\r\nstruct task_struct * tsk\r\ntask\r\nDescription\r\nFilter on the task information and allocate a per-task audit context if necessary. Doing so turns on system call\r\nauditing for the specified task. This is called from copy_process, so no lock is needed.\r\nvoid __audit_free (struct task_struct * tsk)¶\r\nfree a per-task audit context\r\nParameters\r\nstruct task_struct * tsk\r\ntask whose audit context block to free\r\nDescription\r\nCalled from copy_process and do_exit.\r\nvoid __audit_syscall_entry (int major, unsigned long a1, unsigned long a2, unsigned long a3, unsigned\r\nlong a4)¶\r\nfill in an audit record at syscall entry\r\nParameters\r\nint major\r\nmajor syscall type (function)\r\nunsigned long a1\r\nadditional syscall register 1\r\nunsigned long a2\r\nadditional syscall register 2\r\nunsigned long a3\r\nadditional syscall register 3\r\nunsigned long a4\r\nadditional syscall register 4\r\nDescription\r\nFill in audit context at syscall entry. This only happens if the audit context was created when the task was created\r\nand the state or filters demand the audit context be built. If the state from the per-task filter or from the per-syscall\r\nfilter is AUDIT_RECORD_CONTEXT, then the record will be written at syscall exit time (otherwise, it will only\r\nbe written if another part of the kernel requests that it be written).\r\nvoid __audit_syscall_exit (int success, long return_code)¶\r\ndeallocate audit context after a system call\r\nParameters\r\nint success\r\nsuccess value of the syscall\r\nlong return_code\r\nreturn value of the syscall\r\nDescription\r\nTear down after system call. 
If the audit context has been marked as auditable (either because of the\r\nAUDIT_RECORD_CONTEXT state from filtering, or because some other part of the kernel wrote an audit\r\nmessage), then write out the syscall information. In all cases, free the names stored from getname() .\r\nstruct filename * __audit_reusename (const __user char * uptr)¶\r\nfill out filename with info from existing entry\r\nParameters\r\nconst __user char * uptr\r\nuserland ptr to pathname\r\nDescription\r\nSearch the audit_names list for the current audit context. If there is an existing entry with a matching “uptr” then\r\nreturn the filename associated with that audit_name. If not, return NULL.\r\nvoid __audit_getname (struct filename * name)¶\r\nadd a name to the list\r\nParameters\r\nstruct filename * name\r\nname to add\r\nDescription\r\nAdd a name to the list of audit names for this context. Called from fs/namei.c: getname() .\r\nvoid __audit_inode (struct filename * name, const struct dentry * dentry, unsigned int flags)¶\r\nstore the inode and device from a lookup\r\nParameters\r\nstruct filename * name\r\nname being audited\r\nconst struct dentry * dentry\r\ndentry being audited\r\nunsigned int flags\r\nattributes for this particular entry\r\nint auditsc_get_stamp (struct audit_context * ctx, struct timespec64 * t, unsigned int * serial)¶\r\nget local copies of audit_context values\r\nParameters\r\nstruct audit_context * ctx\r\naudit_context for the task\r\nstruct timespec64 * t\r\ntimespec64 to store time recorded in the audit_context\r\nunsigned int * serial\r\nserial value that is recorded in the audit_context\r\nDescription\r\nAlso sets the context as auditable.\r\nint audit_set_loginuid (kuid_t loginuid)¶\r\nset current task’s audit_context loginuid\r\nParameters\r\nkuid_t loginuid\r\nloginuid 
value\r\nDescription\r\nReturns 0.\r\nCalled (set) from fs/proc/base.c:: proc_loginuid_write() .\r\nvoid __audit_mq_open (int oflag, umode_t mode, struct mq_attr * attr)¶\r\nrecord audit data for a POSIX MQ open\r\nParameters\r\nint oflag\r\nopen flag\r\numode_t mode\r\nmode bits\r\nstruct mq_attr * attr\r\nqueue attributes\r\nvoid __audit_mq_sendrecv (mqd_t mqdes, size_t msg_len, unsigned int msg_prio, const struct timespec\r\n* abs_timeout)¶\r\nrecord audit data for a POSIX MQ timed send/receive\r\nParameters\r\nmqd_t mqdes\r\nMQ descriptor\r\nsize_t msg_len\r\nMessage length\r\nunsigned int msg_prio\r\nMessage priority\r\nconst struct timespec * abs_timeout\r\nMessage timeout in absolute time\r\nvoid __audit_mq_notify (mqd_t mqdes, const struct sigevent * notification)¶\r\nrecord audit data for a POSIX MQ notify\r\nParameters\r\nmqd_t mqdes\r\nMQ descriptor\r\nconst struct sigevent * notification\r\nNotification event\r\nvoid __audit_mq_getsetattr (mqd_t mqdes, struct mq_attr * mqstat)¶\r\nrecord audit data for a POSIX MQ get/set attribute\r\nParameters\r\nmqd_t mqdes\r\nMQ descriptor\r\nstruct mq_attr * mqstat\r\nMQ flags\r\nvoid __audit_ipc_obj (struct kern_ipc_perm * ipcp)¶\r\nrecord audit data for ipc object\r\nParameters\r\nstruct kern_ipc_perm * ipcp\r\nipc permissions\r\nvoid __audit_ipc_set_perm (unsigned long qbytes, uid_t uid, gid_t gid, umode_t mode)¶\r\nrecord audit data for new ipc permissions\r\nParameters\r\nunsigned long qbytes\r\nmsgq bytes\r\nuid_t uid\r\nmsgq user id\r\ngid_t gid\r\nmsgq group id\r\numode_t mode\r\nmsgq mode (permissions)\r\nDescription\r\nCalled only after audit_ipc_obj() .\r\nint __audit_socketcall (int nargs, unsigned long * args)¶\r\nrecord audit data for sys_socketcall\r\nParameters\r\nint nargs\r\nnumber of args, which should not be more than 
AUDITSC_ARGS.\r\nunsigned long * args\r\nargs array\r\nvoid __audit_fd_pair (int fd1, int fd2)¶\r\nrecord audit data for pipe and socketpair\r\nParameters\r\nint fd1\r\nthe first file descriptor\r\nint fd2\r\nthe second file descriptor\r\nint __audit_sockaddr (int len, void * a)¶\r\nrecord audit data for sys_bind, sys_connect, sys_sendto\r\nParameters\r\nint len\r\ndata length in user space\r\nvoid * a\r\ndata address in kernel space\r\nDescription\r\nReturns 0 for success or NULL context or < 0 on error.\r\nint audit_signal_info (int sig, struct task_struct * t)¶\r\nrecord signal info for shutting down audit subsystem\r\nParameters\r\nint sig\r\nsignal value\r\nstruct task_struct * t\r\ntask being signaled\r\nDescription\r\nIf the audit subsystem is being terminated, record the task (pid) and uid that is doing that.\r\nint __audit_log_bprm_fcaps (struct linux_binprm * bprm, const struct cred * new, const struct cred * old)¶\r\nstore information about a loading bprm and relevant fcaps\r\nParameters\r\nstruct linux_binprm * bprm\r\npointer to the bprm being processed\r\nconst struct cred * new\r\nthe proposed new credentials\r\nconst struct cred * old\r\nthe old credentials\r\nDescription\r\nSimply check if the proc already has the caps given by the file and if not store the priv escalation info for later\r\nauditing at the end of the syscall\r\n-Eric\r\nvoid __audit_log_capset (const struct cred * new, const struct cred * old)¶\r\nstore information about the arguments to the capset syscall\r\nParameters\r\nconst struct cred * new\r\nthe new credentials\r\nconst struct cred * old\r\nthe old (current) credentials\r\nDescription\r\nRecord the arguments userspace sent to sys_capset for later printing by the audit system if applicable\r\nvoid audit_core_dumps (long signr)¶\r\nrecord information about processes that end abnormally\r\nParameters\r\nlong 
signr
signal value
Description
If a process ends with a core dump, something fishy is going on and we should record the event for investigation.
int audit_rule_change (int type, int seq, void * data, size_t datasz)¶
apply all rules to the specified message type
Parameters
int type
audit message type
int seq
netlink audit message sequence (serial) number
void * data
payload data
size_t datasz
size of payload data
int audit_list_rules_send (struct sk_buff * request_skb, int seq)¶
list the audit rules
Parameters
struct sk_buff * request_skb
skb of request we are replying to (used to target the reply)
int seq
netlink audit message sequence (serial) number
int parent_len (const char * path)¶
find the length of the parent portion of a pathname
Parameters
const char * path
pathname of which to determine length
int audit_compare_dname_path (const char * dname, const char * path, int parentlen)¶
compare given dentry name with last component in given path. Return of 0 indicates a match.
Parameters
const char * dname
dentry name that we’re comparing
const char * path
full pathname that we’re comparing
int parentlen
length of the parent if known. Passing in AUDIT_NAME_FULL here indicates that we must compute this value.
Accounting Framework¶
long sys_acct (const char __user * name)¶
enable/disable process accounting
Parameters
const char __user * name
file name for accounting records, or NULL to shut down accounting
Description
Returns 0 for success or negative errno values for failure.
sys_acct() is the only system call needed to implement process accounting. It takes the name of the file where accounting records should be written.
If the filename is NULL, accounting will be shut down.
void acct_collect (long exitcode, int group_dead)¶
collect accounting information into pacct_struct
Parameters
long exitcode
task exit code
int group_dead
not 0, if this thread is the last one in the process.
void acct_process (void)¶
Parameters
void
no arguments
Description
Handles process accounting for an exiting task.
Block Devices¶
void blk_delay_queue (struct request_queue * q, unsigned long msecs)¶
restart queueing after defined interval
Parameters
struct request_queue * q
The struct request_queue in question
unsigned long msecs
Delay in msecs
Description
Sometimes queueing needs to be postponed for a little while, to allow resources to come back. This function will make sure that queueing is restarted around the specified time. Queue lock must be held.
void blk_start_queue_async (struct request_queue * q)¶
asynchronously restart a previously stopped queue
Parameters
struct request_queue * q
The struct request_queue in question
Description
blk_start_queue_async() will clear the stop flag on the queue, and ensure that the request_fn for the queue is run from an async context.
void blk_start_queue (struct request_queue * q)¶
restart a previously stopped queue
Parameters
struct request_queue * q
The struct request_queue in question
Description
blk_start_queue() will clear the stop flag on the queue, and call the request_fn for the queue if it was in a stopped state when entered. Also see blk_stop_queue().
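The stop/start protocol described for blk_start_queue() and blk_stop_queue() can be sketched as a small userspace model. All `toy_*` names below are hypothetical stand-ins, not the kernel API; the real implementation manipulates a queue flag under the queue lock and dispatches through the driver's request_fn.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical miniature of the stop-flag protocol; the real
 * struct request_queue and request_fn are far richer. */
struct toy_queue {
    bool stopped;    /* stands in for the queue's stopped flag */
    int pending;     /* requests waiting to be dispatched */
    int dispatched;  /* requests handed to the "driver" */
};

/* Driver strategy: consume as much as the (toy) hardware allows. */
static void toy_request_fn(struct toy_queue *q)
{
    while (q->pending > 0) {
        q->pending--;
        q->dispatched++;
    }
}

/* Like blk_stop_queue(): request_fn will no longer be invoked. */
static void toy_stop_queue(struct toy_queue *q)
{
    q->stopped = true;
}

/* Like blk_start_queue(): clear the stop flag and run the queue. */
static void toy_start_queue(struct toy_queue *q)
{
    q->stopped = false;
    toy_request_fn(q);
}

/* Like blk_run_queue(): honours the stop flag. */
static void toy_run_queue(struct toy_queue *q)
{
    if (!q->stopped)
        toy_request_fn(q);
}
```

A stopped queue ignores run requests until the driver signals readiness again by starting it, mirroring the "queue full" back-off pattern described above.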
Queue lock must be held.
void blk_stop_queue (struct request_queue * q)¶
stop a queue
Parameters
struct request_queue * q
The struct request_queue in question
Description
The Linux block layer assumes that a block driver will consume all entries on the request queue when the request_fn strategy is called. Often this will not happen, because of hardware limitations (queue depth settings). If a device driver gets a ‘queue full’ response, or if it simply chooses not to queue more I/O at one point, it can call this function to prevent the request_fn from being called until the driver has signalled it’s ready to go again. This happens by calling blk_start_queue() to restart queue operations. Queue lock must be held.
void blk_sync_queue (struct request_queue * q)¶
cancel any pending callbacks on a queue
Parameters
struct request_queue * q
the queue
Description
The block layer may perform asynchronous callback activity on a queue, such as calling the unplug function after a timeout. A block device may call blk_sync_queue to ensure that any such activity is cancelled, thus allowing it to release resources that the callbacks might use. The caller must already have made sure that its ->make_request_fn will not re-add plugging prior to calling this function.
This function does not cancel any asynchronous activity arising out of elevator or throttling code. That would require elevator_exit() and blkcg_exit_queue() to be called with queue lock initialized.
void __blk_run_queue_uncond (struct request_queue * q)¶
run a queue whether or not it has been stopped
Parameters
struct request_queue * q
The queue to run
Description
Invoke request handling on a queue if there are any pending requests. May be used to restart request handling after a request has completed.
This variant runs the queue whether or not the queue has been stopped. Must be called with the queue lock held and interrupts disabled. See also blk_run_queue().
void __blk_run_queue (struct request_queue * q)¶
run a single device queue
Parameters
struct request_queue * q
The queue to run
Description
See blk_run_queue(). This variant must be called with the queue lock held and interrupts disabled.
void blk_run_queue_async (struct request_queue * q)¶
run a single device queue in workqueue context
Parameters
struct request_queue * q
The queue to run
Description
Tells kblockd to perform the equivalent of blk_run_queue() on our behalf. The caller must hold the queue lock.
void blk_run_queue (struct request_queue * q)¶
run a single device queue
Parameters
struct request_queue * q
The queue to run
Description
Invoke request handling on this queue, if it has pending work to do. May be used to restart queueing when a request has completed.
void blk_queue_bypass_start (struct request_queue * q)¶
enter queue bypass mode
Parameters
struct request_queue * q
queue of interest
Description
In bypass mode, only the dispatch FIFO queue of q is used. This function makes q enter bypass mode and drains all requests which were throttled or issued before.
On return, it’s guaranteed that no request is being throttled or has ELVPRIV set, and that blk_queue_bypass() is true inside queue or RCU read lock.
void blk_queue_bypass_end (struct request_queue * q)¶
leave queue bypass mode
Parameters
struct request_queue * q
queue of interest
Description
Leave bypass mode and restore the normal queueing behavior.
void blk_cleanup_queue (struct request_queue * q)¶
shutdown a request queue
Parameters
struct request_queue * q
request queue to shutdown
Description
Mark q DYING, drain all pending requests, mark q DEAD, destroy and put it. All future requests will be failed immediately with -ENODEV.
struct request_queue * blk_init_queue (request_fn_proc * rfn, spinlock_t * lock)¶
prepare a request queue for use with a block device
Parameters
request_fn_proc * rfn
The function to be called to process requests that have been placed on the queue.
spinlock_t * lock
Request queue spin lock
Description
If a block device wishes to use the standard request handling procedures, which sort requests and coalesce adjacent requests, then it must call blk_init_queue(). The function rfn will be called when there are requests on the queue that need to be processed. If the device supports plugging, then rfn may not be called immediately when requests are available on the queue, but may be called at some time later instead. Plugged queues are generally unplugged when a buffer belonging to one of the requests on the queue is needed, or due to memory pressure.
rfn is not required, or even expected, to remove all requests from the queue, but only as many as it can handle at a time.
If it does leave requests on the queue, it is responsible for arranging that the requests get dealt with eventually.
The queue spin lock must be held while manipulating the requests on the request queue; this lock will also be taken from interrupt context, so irq disabling is needed for it.
Function returns a pointer to the initialized request queue, or NULL if it didn’t succeed.
Note
blk_init_queue() must be paired with a blk_cleanup_queue() call when the block device is deactivated (such as at module unload).
void blk_requeue_request (struct request_queue * q, struct request * rq)¶
put a request back on queue
Parameters
struct request_queue * q
request queue where request should be inserted
struct request * rq
request to be inserted
Description
Drivers often keep queueing requests until the hardware cannot accept more; when that condition happens we need to put the request back on the queue. Must be called with queue lock held.
void part_round_stats (int cpu, struct hd_struct * part)¶
Round off the performance stats on a struct disk_stats.
Parameters
int cpu
cpu number for stats access
struct hd_struct * part
target partition
Description
The average IO queue length and utilisation statistics are maintained by observing the current state of the queue length and the amount of time it has been in this state.
Normally, that accounting is done on IO completion, but that can result in more than a second’s worth of IO being accounted for within any one second, leading to >100% utilisation. To deal with that, we call this function to do a round-off before returning the results when reading /proc/diskstats.
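The round-off that part_round_stats() performs can be modelled in a few lines of userspace C. This is an illustrative sketch only: `toy_stats` is a hypothetical stand-in for struct disk_stats, and `now` stands in for the jiffies counter.

```c
#include <assert.h>

/* Hypothetical model of the stats round-off described above. */
struct toy_stats {
    unsigned long io_ticks;  /* total time the device had I/O in flight */
    unsigned long stamp;     /* last time io_ticks was brought up to date */
    unsigned int in_flight;  /* requests currently outstanding */
};

/* Fold all queue usage up to 'now' into io_ticks before the stats are
 * read, so a single read never reports more than 100% utilisation. */
static void toy_round_stats(struct toy_stats *s, unsigned long now)
{
    if (now == s->stamp)
        return;
    if (s->in_flight)
        s->io_ticks += now - s->stamp;
    s->stamp = now;          /* restart the counter from here */
}
```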
This accounts immediately for all queue usage up to the current jiffies and restarts the counters.
blk_qc_t generic_make_request (struct bio * bio)¶
hand a buffer to its device driver for I/O
Parameters
struct bio * bio
The bio describing the location in memory and on the device.
Description
generic_make_request() is used to make I/O requests of block devices. It is passed a struct bio, which describes the I/O that needs to be done.
generic_make_request() does not return any status. The success/failure status of the request, along with notification of completion, is delivered asynchronously through the bio->bi_end_io function described (one day) elsewhere.
The caller of generic_make_request() must make sure that bi_io_vec is set to describe the memory buffer, that bi_dev and bi_sector are set to describe the device address, and that bi_end_io and optionally bi_private are set to describe how completion notification should be signaled.
generic_make_request() and the drivers it calls may use bi_next if this bio happens to be merged with someone else, and may resubmit the bio to a lower device by calling into generic_make_request() recursively, which means the bio should NOT be touched after the call to ->make_request_fn.
blk_qc_t submit_bio (struct bio * bio)¶
submit a bio to the block device layer for I/O
Parameters
struct bio * bio
The struct bio which describes the I/O
Description
submit_bio() is very similar in purpose to generic_make_request(), and uses that function to do most of the work.
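The completion contract described above — submission returns no status, and the result arrives through the bio's end_io hook — can be sketched as a toy userspace model. All `toy_*` names are hypothetical; only the shape of the contract (callback plus private context, mirroring bi_end_io and bi_private) comes from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a bio: completion status is not returned by
 * the submit call, it is delivered through the end_io callback. */
struct toy_bio {
    int status;                          /* filled in at completion time */
    void (*end_io)(struct toy_bio *bio); /* completion hook, like bi_end_io */
    void *priv;                          /* caller context, like bi_private */
};

/* "Submit" the bio. In the kernel completion is asynchronous; here it is
 * modelled by invoking end_io before returning. The caller never sees a
 * status from this function itself. */
static void toy_submit_bio(struct toy_bio *bio, int status)
{
    bio->status = status;
    if (bio->end_io)
        bio->end_io(bio);
}

/* An issuer-supplied completion handler: copy the status out via priv. */
static void toy_end_io(struct toy_bio *bio)
{
    *(int *)bio->priv = bio->status;
}
```

The issuer sets end_io and priv before submitting, exactly as the text requires bi_end_io and bi_private to be set up before calling generic_make_request().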
Both are fairly rough interfaces; the bio must be pre-set up and ready for I/O.
int blk_insert_cloned_request (struct request_queue * q, struct request * rq)¶
Helper for stacking drivers to submit a request
Parameters
struct request_queue * q
the queue to submit the request
struct request * rq
the request being queued
unsigned int blk_rq_err_bytes (const struct request * rq)¶
determine number of bytes till the next failure boundary
Parameters
const struct request * rq
request to examine
Description
A request could be a merge of IOs which require different failure handling. This function determines the number of bytes which can be failed from the beginning of the request without crossing into an area which needs to be retried further.
Return
The number of bytes to fail.
Context
queue_lock must be held.
struct request * blk_peek_request (struct request_queue * q)¶
peek at the top of a request queue
Parameters
struct request_queue * q
request queue to peek at
Description
Return the request at the top of q. The returned request should be started using blk_start_request() before the LLD starts processing it.
Return
Pointer to the request at the top of q if available. NULL otherwise.
Context
queue_lock must be held.
void blk_start_request (struct request * req)¶
start request processing on the driver
Parameters
struct request * req
request to dequeue
Description
Dequeue req and start the timeout timer on it.
This hands off the request to the driver.
Block-internal functions which don’t want to start the timer should call blk_dequeue_request().
Context
queue_lock must be held.
struct request * blk_fetch_request (struct request_queue * q)¶
fetch a request from a request queue
Parameters
struct request_queue * q
request queue to fetch a request from
Description
Return the request at the top of q. The request is started on return and the LLD can start processing it immediately.
Return
Pointer to the request at the top of q if available. NULL otherwise.
Context
queue_lock must be held.
bool blk_update_request (struct request * req, int error, unsigned int nr_bytes)¶
Special helper function for request stacking drivers
Parameters
struct request * req
the request being processed
int error
0 for success, < 0 for error
unsigned int nr_bytes
number of bytes to complete req
Description
Ends I/O on a number of bytes attached to req, but doesn’t complete the request structure even if req doesn’t have leftover. If req has leftover, sets it up for the next range of segments.
This special helper function is only for request stacking drivers (e.g. request-based dm) so that they can handle partial completion. Actual device drivers should use blk_end_request() instead.
Passing the result of blk_rq_bytes() as nr_bytes guarantees a false return from this function.
Return
false - this request doesn’t have any more data
true - this request has more data
void blk_unprep_request (struct request * req)¶
unprepare a request
Parameters
struct request * req
the request
Description
This function makes a request ready for complete resubmission (or completion).
It happens only after all error handling is complete, so it represents the appropriate moment to deallocate any resources that were allocated to the request in the prep_rq_fn. The queue lock is held when calling this.
bool blk_end_request (struct request * rq, int error, unsigned int nr_bytes)¶
Helper function for drivers to complete the request.
Parameters
struct request * rq
the request being processed
int error
0 for success, < 0 for error
unsigned int nr_bytes
number of bytes to complete
Description
Ends I/O on a number of bytes attached to rq. If rq has leftover, sets it up for the next range of segments.
Return
false - we are done with this request
true - still buffers pending for this request
void blk_end_request_all (struct request * rq, int error)¶
Helper function for drivers to finish the request.
Parameters
struct request * rq
the request to finish
int error
0 for success, < 0 for error
Description
Completely finish rq.
bool __blk_end_request (struct request * rq, int error, unsigned int nr_bytes)¶
Helper function for drivers to complete the request.
Parameters
struct request * rq
the request being processed
int error
0 for success, < 0 for error
unsigned int nr_bytes
number of bytes to complete
Description
Must be called with queue lock held, unlike blk_end_request().
Return
false - we are done with this request
true - still buffers pending for this request
void __blk_end_request_all (struct request * rq, int error)¶
Helper function for drivers to finish the request.
Parameters
struct request * rq
the request to finish
int error
0 for success, < 0 for error
Description
Completely finish rq.
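The false/true return convention shared by blk_update_request() and the blk_end_request() family can be modelled with a toy byte counter. This is a hypothetical userspace sketch, with `remaining` playing the role of blk_rq_bytes(); the real functions also manage segments, errors, and the request structure itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a request's outstanding byte count. */
struct toy_request {
    unsigned int remaining;  /* bytes not yet completed */
};

/* Model of partial completion: returns true while the request still has
 * data left (set up for the next range), false once fully accounted. */
static bool toy_update_request(struct toy_request *req, unsigned int nr_bytes)
{
    if (nr_bytes >= req->remaining) {
        req->remaining = 0;
        return false;        /* no more data: caller may finish the request */
    }
    req->remaining -= nr_bytes;
    return true;             /* leftover: more completion rounds needed */
}
```

Note how passing the full outstanding byte count — the analogue of passing blk_rq_bytes() as nr_bytes — always yields a false return, as the documentation above guarantees.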
Must be called with queue lock held.
bool __blk_end_request_cur (struct request * rq, int error)¶
Helper function to finish the current request chunk.
Parameters
struct request * rq
the request to finish the current chunk for
int error
0 for success, < 0 for error
Description
Complete the current consecutively mapped chunk from rq. Must be called with queue lock held.
Return
false - we are done with this request
true - still buffers pending for this request
void rq_flush_dcache_pages (struct request * rq)¶
Helper function to flush all pages in a request
Parameters
struct request * rq
the request to be flushed
Description
Flush all pages in rq.
int blk_lld_busy (struct request_queue * q)¶
Check if underlying low-level drivers of a device are busy
Parameters
struct request_queue * q
the queue of the device being checked
Description
Check if underlying low-level drivers of a device are busy. If the drivers want to export their busy state, they must first set their own exporting function using blk_queue_lld_busy().
Basically, this function is used only by request stacking drivers to stop dispatching requests to underlying devices when the underlying devices are busy.
This behavior allows more I/O merging on the queue of the request stacking driver and prevents I/O throughput regression on burst I/O load.
Return
0 - Not busy (The request stacking driver should dispatch requests)
1 - Busy (The request stacking driver should stop dispatching requests)
void blk_rq_unprep_clone (struct request * rq)¶
Helper function to free all bios in a cloned request
Parameters
struct request * rq
the clone request to be cleaned up
Description
Free all bios in rq for a cloned request.
int blk_rq_prep_clone (struct request * rq, struct request * rq_src, struct bio_set * bs, gfp_t gfp_mask, int (*bio_ctr)(struct bio *, struct bio *, void *), void * data)¶
Helper function to setup clone request
Parameters
struct request * rq
the request to be setup
struct request * rq_src
original request to be cloned
struct bio_set * bs
bio_set that bios for clone are allocated from
gfp_t gfp_mask
memory allocation mask for bio
int (*)(struct bio *, struct bio *, void *) bio_ctr
setup function to be called for each clone bio. Returns 0 for success, non-0 for failure.
void * data
private data to be passed to bio_ctr
Description
Clones bios in rq_src to rq, and copies attributes of rq_src to rq. The actual data parts of rq_src (e.g. ->cmd, ->sense) are not copied, and copying such parts is the caller’s responsibility. Also, pages which the original bios are pointing to are not copied, and the cloned bios just point to the same pages.
So cloned bios must be completed before the original bios, which means the caller must complete rq before rq_src.
void blk_start_plug (struct blk_plug * plug)¶
initialize blk_plug and track it inside the task_struct
Parameters
struct blk_plug * plug
The struct blk_plug that needs to be initialized
Description
Tracking blk_plug inside the task_struct will help with auto-flushing the pending I/O should the task end up blocking between blk_start_plug() and blk_finish_plug(). This is important from a performance perspective, but also ensures that we don’t deadlock. For instance, if the task is blocking for a memory allocation, memory reclaim could end up wanting to free a page belonging to that request that is currently residing in our private plug. By flushing the pending I/O when the process goes to sleep, we avoid this kind of deadlock.
void blk_pm_runtime_init (struct request_queue * q, struct device * dev)¶
Block layer runtime PM initialization routine
Parameters
struct request_queue * q
the queue of the device
struct device * dev
the device the queue belongs to
Description
Initialize runtime-PM-related fields for q and start auto suspend for dev. Drivers that want to take advantage of request-based runtime PM should call this function after dev has been initialized and its request queue q has been allocated, and while runtime PM for it cannot happen yet (either due to disabled/forbidden or its usage_count > 0). In most cases, the driver should call this function before any I/O has taken place.
This function takes care of setting up auto suspend for the device; the autosuspend delay is set to -1 to make runtime suspend impossible until an updated value is either set by the user or by the driver.
Drivers do not need to touch other autosuspend settings.
The block layer runtime PM is request based, so it only works for drivers that use requests as their I/O unit instead of those that use bios directly.
int blk_pre_runtime_suspend (struct request_queue * q)¶
Pre runtime suspend check
Parameters
struct request_queue * q
the queue of the device
Description
This function will check if runtime suspend is allowed for the device by examining if there are any requests pending in the queue. If there are requests pending, the device can not be runtime suspended; otherwise, the queue’s status will be updated to SUSPENDING and the driver can proceed to suspend the device.
For the not-allowed case, we mark last busy for the device so that the runtime PM core will try to autosuspend it some time later.
This function should be called near the start of the device’s runtime_suspend callback.
Return
0 - OK to runtime suspend the device
-EBUSY - Device should not be runtime suspended
void blk_post_runtime_suspend (struct request_queue * q, int err)¶
Post runtime suspend processing
Parameters
struct request_queue * q
the queue of the device
int err
return value of the device’s runtime_suspend function
Description
Update the queue’s runtime status according to the return value of the device’s runtime suspend function and mark last busy for the device so that the PM core will try to auto suspend the device at a later time.
This function should be called near the end of the device’s runtime_suspend callback.
void blk_pre_runtime_resume (struct request_queue * q)¶
Pre runtime resume processing
Parameters
struct request_queue * q
the queue of the device
Description
Update the queue’s runtime status to RESUMING in preparation
for the runtime resume of the device.
This function should be called near the start of the device’s runtime_resume callback.
void blk_post_runtime_resume (struct request_queue * q, int err)¶
Post runtime resume processing
Parameters
struct request_queue * q
the queue of the device
int err
return value of the device’s runtime_resume function
Description
Update the queue’s runtime status according to the return value of the device’s runtime_resume function. If it was successfully resumed, process the requests that were queued into the device’s queue while it was resuming, then mark last busy and initiate autosuspend for it.
This function should be called near the end of the device’s runtime_resume callback.
void blk_set_runtime_active (struct request_queue * q)¶
Force runtime status of the queue to be active
Parameters
struct request_queue * q
the queue of the device
Description
If the device is left runtime suspended during system suspend, the resume hook typically resumes the device and corrects the runtime status accordingly. However, that does not affect the queue runtime PM status, which is still “suspended”. This prevents processing requests from the queue.
This function can be used in the driver’s resume hook to correct the queue runtime PM status and re-enable peeking requests from the queue. It should be called before the first request is added to the queue.
void __blk_drain_queue (struct request_queue * q, bool drain_all)¶
drain requests from request_queue
Parameters
struct request_queue * q
queue to drain
bool drain_all
whether to drain all requests or only the ones w/ ELVPRIV
Description
Drain requests from q. If drain_all is set, all requests are drained. If not, only ELVPRIV requests are drained.
The caller is responsible for ensuring that no new requests which need to be drained are queued.
struct request * __get_request (struct request_list * rl, unsigned int op, struct bio * bio, gfp_t gfp_mask)¶
get a free request
Parameters
struct request_list * rl
request list to allocate from
unsigned int op
operation and flags
struct bio * bio
bio to allocate request for (can be NULL)
gfp_t gfp_mask
allocation mask
Description
Get a free request from q. This function may fail under memory pressure or if q is dead.
Must be called with q->queue_lock held. Returns ERR_PTR on failure, with q->queue_lock held; returns a request pointer on success, with q->queue_lock not held.
struct request * get_request (struct request_queue * q, unsigned int op, struct bio * bio, gfp_t gfp_mask)¶
get a free request
Parameters
struct request_queue * q
request_queue to allocate request from
unsigned int op
operation and flags
struct bio * bio
bio to allocate request for (can be NULL)
gfp_t gfp_mask
allocation mask
Description
Get a free request from q. If __GFP_DIRECT_RECLAIM is set in gfp_mask, this function keeps retrying under memory pressure and fails iff q is dead.
Must be called with q->queue_lock held. Returns ERR_PTR on failure, with q->queue_lock held.
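The ERR_PTR convention that __get_request() and get_request() use — folding a negative errno into the returned pointer instead of returning NULL — can be sketched in userspace C. The `toy_*` names below are illustrative; the kernel's real macros live in include/linux/err.h, and this sketch only mirrors them in spirit.

```c
#include <assert.h>
#include <stdint.h>

/* The kernel reserves the top 4095 values of the address space for
 * encoded errnos; a pointer in that range is an error, not an address. */
#define TOY_MAX_ERRNO 4095

/* ERR_PTR() analogue: encode a negative errno as a pointer value. */
static void *toy_err_ptr(long error)
{
    return (void *)(intptr_t)error;
}

/* PTR_ERR() analogue: recover the errno from an encoded pointer. */
static long toy_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* IS_ERR() analogue: does this pointer actually carry an errno? */
static int toy_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}
```

This is why the allocation functions above can report both a request pointer and a precise failure reason through a single return value.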
Returns a request pointer on success, with q->queue_lock not held.
bool blk_attempt_plug_merge (struct request_queue * q, struct bio * bio, unsigned int * request_count, struct request ** same_queue_rq)¶
try to merge with current’s plugged list
Parameters
struct request_queue * q
request_queue new bio is being queued at
struct bio * bio
new bio being queued
unsigned int * request_count
out parameter for number of traversed plugged requests
struct request ** same_queue_rq
pointer to struct request that gets filled in when another request associated with q is found on the plug list (optional, may be NULL)
Description
Determine whether bio being queued on q can be merged with a request on current’s plugged list. Returns true if the merge was successful, otherwise false.
Plugging coalesces IOs from the same issuer for the same purpose without going through q->queue_lock. As such it’s more of an issuing mechanism than a scheduling one, and the request, while it may have elvpriv data, is not added on the elevator at this point. In addition, we don’t have reliable access to the elevator outside the queue lock. Only check basic merging parameters without querying the elevator.
Caller must ensure !blk_queue_nomerges(q) beforehand.
int blk_cloned_rq_check_limits (struct request_queue * q, struct request * rq)¶
Helper function to check a cloned request for the new queue limits
Parameters
struct request_queue * q
the queue
struct request * rq
the request being checked
Description
rq may have been made based on weaker limitations of upper-level queues in request stacking drivers, and it may violate the limitation of q.
Since the block layer and the underlying device driver trust rq after it is inserted to q, it should be checked against q before the insertion using this generic function.
Request stacking drivers like request-based dm may change the queue limits when retrying requests on other queues. Those requests need to be checked against the new queue limits again during dispatch.
bool blk_end_bidi_request (struct request * rq, int error, unsigned int nr_bytes, unsigned int bidi_bytes)¶
Complete a bidi request
Parameters
struct request * rq
the request to complete
int error
0 for success, < 0 for error
unsigned int nr_bytes
number of bytes to complete rq
unsigned int bidi_bytes
number of bytes to complete rq->next_rq
Description
Ends I/O on a number of bytes attached to rq and rq->next_rq. Drivers that support bidi can safely call this member for any type of request, bidi or uni. In the latter case bidi_bytes is just ignored.
Return
false - we are done with this request
true - still buffers pending for this request
bool __blk_end_bidi_request (struct request * rq, int error, unsigned int nr_bytes, unsigned int bidi_bytes)¶
Complete a bidi request with queue lock held
Parameters
struct request * rq
the request to complete
int error
0 for success, < 0 for error
unsigned int nr_bytes
number of bytes to complete rq
unsigned int bidi_bytes
number of bytes to complete rq->next_rq
Description
Identical to blk_end_bidi_request() except that the queue lock is assumed to be locked on entry and remains so on return.
Return
false - we are done with this request
true - still buffers pending for this request
int blk_rq_map_user_iov (struct request_queue * q, struct request * rq, struct rq_map_data * map_data, const struct iov_iter * iter, gfp_t gfp_mask)¶
map user data to a
request, for passthrough requests
Parameters
struct request_queue * q
request queue where request should be inserted
struct request * rq
request to map data to
struct rq_map_data * map_data
pointer to the rq_map_data holding pages (if necessary)
const struct iov_iter * iter
iovec iterator
gfp_t gfp_mask
memory allocation flags
Description
Data will be mapped directly for zero-copy I/O, if possible. Otherwise a kernel bounce buffer is used.
A matching blk_rq_unmap_user() must be issued at the end of I/O, while still in process context.
Note
The mapped bio may need to be bounced through blk_queue_bounce() before being submitted to the device, as pages mapped may be out of reach. It’s the caller’s responsibility to make sure this happens. The original bio must be passed back in to blk_rq_unmap_user() for proper unmapping.
int blk_rq_unmap_user (struct bio * bio)¶
unmap a request with user data
Parameters
struct bio * bio
start of bio list
Description
Unmap an rq previously mapped by blk_rq_map_user(). The caller must supply the original rq->bio from the blk_rq_map_user() return, since the I/O completion may have changed rq->bio.
int blk_rq_map_kern (struct request_queue * q, struct request * rq, void * kbuf, unsigned int len, gfp_t gfp_mask)¶
map kernel data to a request, for passthrough requests
Parameters
struct request_queue * q
request queue where request should be inserted
struct request * rq
request to fill
void * kbuf
the kernel buffer
unsigned int len
length of user data
gfp_t gfp_mask
memory allocation flags
Description
Data will be mapped directly if possible. Otherwise a bounce buffer is used.
Can be called multiple times to append multiple buffers.
void __blk_release_queue (struct work_struct * work)¶
release a request queue when it is no longer needed
Parameters
struct work_struct * work
pointer to the release_work member of the request queue to be released
Description
blk_release_queue is the counterpart of blk_init_queue(). It should be called when a request queue is being released; typically when a block device is being de-registered. Its primary task is to free the queue itself.
Notes
The low level driver must have finished any outstanding requests first via blk_cleanup_queue().
Although blk_release_queue() may be called with preemption disabled, __blk_release_queue() may sleep.
void blk_queue_prep_rq (struct request_queue * q, prep_rq_fn * pfn)¶
set a prepare_request function for queue
Parameters
struct request_queue * q
queue
prep_rq_fn * pfn
prepare_request function
Description
It's possible for a queue to register a prepare_request callback which is invoked before the request is handed to the request_fn. The goal of the function is to prepare a request for I/O; it can be used to build a cdb from the request data, for instance.
void blk_queue_unprep_rq (struct request_queue * q, unprep_rq_fn * ufn)¶
set an unprepare_request function for queue
Parameters
struct request_queue * q
queue
unprep_rq_fn * ufn
unprepare_request function
Description
It's possible for a queue to register an unprepare_request callback which is invoked before the request is finally completed.
The goal of the function is to deallocate any data that was allocated in the prepare_request callback.
void blk_set_default_limits (struct queue_limits * lim)¶
reset limits to default values
Parameters
struct queue_limits * lim
the queue_limits structure to reset
Description
Resets a queue_limits structure to its default state.
void blk_set_stacking_limits (struct queue_limits * lim)¶
set default limits for stacking devices
Parameters
struct queue_limits * lim
the queue_limits structure to reset
Description
Resets a queue_limits structure to its default state. Should be used by stacking drivers like DM that have no internal limits.
void blk_queue_make_request (struct request_queue * q, make_request_fn * mfn)¶
define an alternate make_request function for a device
Parameters
struct request_queue * q
the request queue for the device to be affected
make_request_fn * mfn
the alternate make_request function
Description
The normal way for struct bios to be passed to a device driver is for them to be collected into requests on a request queue, and then to allow the device driver to select requests off that queue when it is ready. This works well for many block devices. However, some block devices (typically virtual devices such as md or lvm) do not benefit from the processing on the request queue, and are served best by having the requests passed directly to them. This can be achieved by providing a function to blk_queue_make_request().
Caveat:
The driver that does this must be able to deal appropriately with buffers in "highmemory".
This can be accomplished by either calling __bio_kmap_atomic() to get a temporary kernel mapping, or by calling blk_queue_bounce() to create a buffer in normal memory.
void blk_queue_bounce_limit (struct request_queue * q, u64 max_addr)¶
set bounce buffer limit for queue
Parameters
struct request_queue * q
the request queue for the device
u64 max_addr
the maximum address the device can handle
Description
Different hardware can have different requirements as to what pages it can do I/O directly to. A low level driver can call blk_queue_bounce_limit to have lower memory pages allocated as bounce buffers for doing I/O to pages residing above max_addr.
void blk_queue_max_hw_sectors (struct request_queue * q, unsigned int max_hw_sectors)¶
set max sectors for a request for this queue
Parameters
struct request_queue * q
the request queue for the device
unsigned int max_hw_sectors
max hardware sectors in the usual 512b unit
Description
Enables a low level driver to set a hard upper limit, max_hw_sectors, on the size of requests. max_hw_sectors is set by the device driver based upon the capabilities of the I/O controller.
max_dev_sectors is a hard limit imposed by the storage device for READ/WRITE requests. It is set by the disk driver.
max_sectors is a soft limit imposed by the block layer for filesystem type requests. This value can be overridden on a per-device basis in /sys/block/<device>/queue/max_sectors_kb.
The soft limit cannot exceed max_hw_sectors.
void blk_queue_chunk_sectors (struct request_queue * q, unsigned int chunk_sectors)¶
set size of the chunk for this queue
Parameters
struct request_queue * q
the request queue for the device
unsigned int chunk_sectors
chunk sectors in the usual 512b unit
Description
If a driver doesn't want IOs to cross a given chunk size, it can set this limit and prevent merging across chunks. Note that the chunk size must currently be a power-of-2 in sectors. Also note that the block layer must accept a page worth of data at any offset. So if the crossing of chunks is a hard limitation in the driver, it must still be prepared to split single page bios.
void blk_queue_max_discard_sectors (struct request_queue * q, unsigned int max_discard_sectors)¶
set max sectors for a single discard
Parameters
struct request_queue * q
the request queue for the device
unsigned int max_discard_sectors
maximum number of sectors to discard
void blk_queue_max_write_same_sectors (struct request_queue * q, unsigned int max_write_same_sectors)¶
set max sectors for a single write same
Parameters
struct request_queue * q
the request queue for the device
unsigned int max_write_same_sectors
maximum number of sectors to write per command
void blk_queue_max_write_zeroes_sectors (struct request_queue * q, unsigned int max_write_zeroes_sectors)¶
set max sectors for a single write zeroes
Parameters
struct request_queue * q
the request queue for the device
unsigned int max_write_zeroes_sectors
maximum number of sectors to write per command
void blk_queue_max_segments (struct request_queue * q, unsigned short max_segments)¶
set max hw segments for a request for this queue
Parameters
struct request_queue * q
the request queue for the device
unsigned short
max_segments
max number of segments
Description
Enables a low level driver to set an upper limit on the number of hw data segments in a request.
void blk_queue_max_discard_segments (struct request_queue * q, unsigned short max_segments)¶
set max segments for discard requests
Parameters
struct request_queue * q
the request queue for the device
unsigned short max_segments
max number of segments
Description
Enables a low level driver to set an upper limit on the number of segments in a discard request.
void blk_queue_max_segment_size (struct request_queue * q, unsigned int max_size)¶
set max segment size for blk_rq_map_sg
Parameters
struct request_queue * q
the request queue for the device
unsigned int max_size
max size of segment in bytes
Description
Enables a low level driver to set an upper limit on the size of a coalesced segment
void blk_queue_logical_block_size (struct request_queue * q, unsigned short size)¶
set logical block size for the queue
Parameters
struct request_queue * q
the request queue for the device
unsigned short size
the logical block size, in bytes
Description
This should be set to the lowest possible block size that the storage device can address.
The default of 512 covers most hardware.
void blk_queue_physical_block_size (struct request_queue * q, unsigned int size)¶
set physical block size for the queue
Parameters
struct request_queue * q
the request queue for the device
unsigned int size
the physical block size, in bytes
Description
This should be set to the lowest possible sector size that the hardware can operate on without reverting to read-modify-write operations.
void blk_queue_alignment_offset (struct request_queue * q, unsigned int offset)¶
set physical block alignment offset
Parameters
struct request_queue * q
the request queue for the device
unsigned int offset
alignment offset in bytes
Description
Some devices are naturally misaligned to compensate for things like the legacy DOS partition table 63-sector offset. Low-level drivers should call this function for devices whose first sector is not naturally aligned.
void blk_limits_io_min (struct queue_limits * limits, unsigned int min)¶
set minimum request size for a device
Parameters
struct queue_limits * limits
the queue limits
unsigned int min
smallest I/O size in bytes
Description
Some devices have an internal block size bigger than the reported hardware sector size.
This function can be used to signal the smallest I/O the device can perform without incurring a performance penalty.
void blk_queue_io_min (struct request_queue * q, unsigned int min)¶
set minimum request size for the queue
Parameters
struct request_queue * q
the request queue for the device
unsigned int min
smallest I/O size in bytes
Description
Storage devices may report a granularity or preferred minimum I/O size which is the smallest request the device can perform without incurring a performance penalty. For disk drives this is often the physical block size. For RAID arrays it is often the stripe chunk size. A properly aligned multiple of minimum_io_size is the preferred request size for workloads where a high number of I/O operations is desired.
void blk_limits_io_opt (struct queue_limits * limits, unsigned int opt)¶
set optimal request size for a device
Parameters
struct queue_limits * limits
the queue limits
unsigned int opt
optimal request size in bytes
Description
Storage devices may report an optimal I/O size, which is the device's preferred unit for sustained I/O. This is rarely reported for disk drives. For RAID arrays it is usually the stripe width or the internal track size. A properly aligned multiple of optimal_io_size is the preferred request size for workloads where sustained throughput is desired.
void blk_queue_io_opt (struct request_queue * q, unsigned int opt)¶
set optimal request size for the queue
Parameters
struct request_queue * q
the request queue for the device
unsigned int opt
optimal request size in bytes
Description
Storage devices may report an optimal I/O size, which is the device's preferred unit for sustained I/O. This is rarely reported for disk drives.
For RAID arrays it is usually the stripe width or the internal track size. A properly aligned multiple of optimal_io_size is the preferred request size for workloads where sustained throughput is desired.
void blk_queue_stack_limits (struct request_queue * t, struct request_queue * b)¶
inherit underlying queue limits for stacked drivers
Parameters
struct request_queue * t
the stacking driver (top)
struct request_queue * b
the underlying device (bottom)
int blk_stack_limits (struct queue_limits * t, struct queue_limits * b, sector_t start)¶
adjust queue_limits for stacked devices
Parameters
struct queue_limits * t
the stacking driver limits (top device)
struct queue_limits * b
the underlying queue limits (bottom, component device)
sector_t start
first data sector within component device
Description
This function is used by stacking drivers like MD and DM to ensure that all component devices have compatible block sizes and alignments. The stacking driver must provide a queue_limits struct (top) and then iteratively call the stacking function for all component (bottom) devices. The stacking function will attempt to combine the values and ensure proper alignment.
Returns 0 if the top and bottom queue_limits are compatible. The top device's block sizes and alignment offsets may be adjusted to ensure alignment with the bottom device.
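The combining step and its failure mode can be illustrated with a heavily reduced userspace model (hypothetical struct; the real queue_limits has many more members and subtler rules): the top limits take the stricter of each value, and an alignment conflict sets the misaligned flag and returns -1.

```c
#include <assert.h>

/* Reduced model of queue_limits stacking: only three fields of the real
 * struct are modeled, and the combining rules are simplified. */
struct limits {
    unsigned int max_sectors;       /* request size cap, in 512b sectors */
    unsigned int io_min;            /* minimum I/O granularity, bytes */
    unsigned int alignment_offset;  /* offset of first aligned byte */
    int misaligned;
};

static unsigned int lcm(unsigned int a, unsigned int b)
{
    unsigned int x = a, y = b;
    while (y) { unsigned int t = x % y; x = y; y = t; }  /* x = gcd(a, b) */
    return a / x * b;
}

/* Fold the bottom device's limits into the top; start is the first data
 * sector of the bottom device within the top. Returns 0 if compatible. */
static int stack_limits(struct limits *t, const struct limits *b,
                        unsigned long long start)
{
    unsigned int align;

    if (t->max_sectors > b->max_sectors)
        t->max_sectors = b->max_sectors;     /* stricter cap wins */
    t->io_min = lcm(t->io_min, b->io_min);   /* honor both granularities */

    /* Where does the bottom device's aligned region fall in the top? */
    align = (unsigned int)((start * 512 + b->alignment_offset) % t->io_min);
    if (t->alignment_offset != align) {
        t->misaligned = 1;                   /* alignment_offset undefined */
        return -1;
    }
    return 0;
}
```

Calling this iteratively for each component device mirrors how a stacking driver accumulates one combined top-level queue_limits.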
If no compatible sizes and alignments exist, -1 is returned and the resulting top queue_limits will have the misaligned flag set to indicate that the alignment_offset is undefined.
int bdev_stack_limits (struct queue_limits * t, struct block_device * bdev, sector_t start)¶
adjust queue limits for stacked drivers
Parameters
struct queue_limits * t
the stacking driver limits (top device)
struct block_device * bdev
the component block_device (bottom)
sector_t start
first data sector within component device
Description
Merges queue limits for a top device and a block_device. Returns 0 if alignment didn't change. Returns -1 if adding the bottom device caused misalignment.
void disk_stack_limits (struct gendisk * disk, struct block_device * bdev, sector_t offset)¶
adjust queue limits for stacked drivers
Parameters
struct gendisk * disk
MD/DM gendisk (top)
struct block_device * bdev
the underlying block device (bottom)
sector_t offset
offset to beginning of data within component device
Description
Merges the limits for a top level gendisk and a bottom level block_device.
void blk_queue_dma_pad (struct request_queue * q, unsigned int mask)¶
set pad mask
Parameters
struct request_queue * q
the request queue for the device
unsigned int mask
pad mask
Description
Set dma pad mask.
Appending a pad buffer to a request modifies the last entry of the scatter list such that it includes the pad buffer.
void blk_queue_update_dma_pad (struct request_queue * q, unsigned int mask)¶
update pad mask
Parameters
struct request_queue * q
the request queue for the device
unsigned int mask
pad mask
Description
Update dma pad mask.
Appending a pad buffer to a request modifies the last entry of a
scatter list such that it includes the pad buffer.
int blk_queue_dma_drain (struct request_queue * q, dma_drain_needed_fn * dma_drain_needed, void * buf, unsigned int size)¶
Set up a drain buffer for excess dma.
Parameters
struct request_queue * q
the request queue for the device
dma_drain_needed_fn * dma_drain_needed
fn which returns non-zero if drain is necessary
void * buf
physically contiguous buffer
unsigned int size
size of the buffer in bytes
Description
Some devices have excess DMA problems and can't simply discard (or zero fill) the unwanted piece of the transfer. They have to have a real area of memory to transfer it into. The use case for this is ATAPI devices in DMA mode. If the packet command causes a transfer bigger than the transfer size, some HBAs will lock up if there aren't DMA elements to contain the excess transfer. What this API does is adjust the queue so that buf is always appended silently to the scatterlist.
Note
This routine adjusts max_hw_segments to make room for appending the drain buffer.
If you call blk_queue_max_segments() after calling this routine, you must set the limit to one fewer than your device can support, otherwise there won't be room for the drain buffer.
void blk_queue_segment_boundary (struct request_queue * q, unsigned long mask)¶
set boundary rules for segment merging
Parameters
struct request_queue * q
the request queue for the device
unsigned long mask
the memory boundary mask
void blk_queue_virt_boundary (struct request_queue * q, unsigned long mask)¶
set boundary rules for bio merging
Parameters
struct request_queue * q
the request queue for the device
unsigned long mask
the memory boundary mask
void blk_queue_dma_alignment (struct request_queue * q, int mask)¶
set dma length and memory alignment
Parameters
struct request_queue * q
the request queue for the device
int mask
alignment mask
Description
Set the required memory and length alignment for direct dma transactions. This is used when building direct io requests for the queue.
void blk_queue_update_dma_alignment (struct request_queue * q, int mask)¶
update dma length and memory alignment
Parameters
struct request_queue * q
the request queue for the device
int mask
alignment mask
Description
Update the required memory and length alignment for direct dma transactions. If the requested alignment is larger than the current alignment, then the current queue alignment is updated to the new value, otherwise it is left alone.
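This grow-only update rule can be sketched in a few lines (illustrative userspace model, not the kernel code): the queue simply keeps the largest alignment mask any caller has requested.

```c
#include <assert.h>

/* Grow-only alignment update: adopt a stricter (larger) mask, ignore
 * weaker requests. Models blk_queue_update_dma_alignment()'s rule. */
struct queue { int dma_alignment; };

static void update_dma_alignment(struct queue *q, int mask)
{
    if (mask > q->dma_alignment)   /* tighter requirement: adopt it */
        q->dma_alignment = mask;   /* otherwise leave the current value */
}
```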
The design of this is to allow multiple objects (driver, device, transport, etc.) to set their respective alignments without having them interfere.
void blk_set_queue_depth (struct request_queue * q, unsigned int depth)¶
tell the block layer about the device queue depth
Parameters
struct request_queue * q
the request queue for the device
unsigned int depth
queue depth
void blk_queue_write_cache (struct request_queue * q, bool wc, bool fua)¶
configure queue's write cache
Parameters
struct request_queue * q
the request queue for the device
bool wc
write back cache on or off
bool fua
device supports FUA writes, if true
Description
Tell the block layer about the write cache of q.
void blk_execute_rq_nowait (struct request_queue * q, struct gendisk * bd_disk, struct request * rq, int at_head, rq_end_io_fn * done)¶
insert a request into queue for execution
Parameters
struct request_queue * q
queue to insert the request in
struct gendisk * bd_disk
matching gendisk
struct request * rq
request to insert
int at_head
insert request at head or tail of queue
rq_end_io_fn * done
I/O completion handler
Description
Insert a fully prepared request at the back of the I/O scheduler queue for execution.
Don't wait for completion.
Note
This function will invoke done directly if the queue is dead.
void blk_execute_rq (struct request_queue * q, struct gendisk * bd_disk, struct request * rq, int at_head)¶
insert a request into queue for execution
Parameters
struct request_queue * q
queue to insert the request in
struct gendisk * bd_disk
matching gendisk
struct request * rq
request to insert
int at_head
insert request at head or tail of queue
Description
Insert a fully prepared request at the back of the I/O scheduler queue for execution and wait for completion.
int blkdev_issue_flush (struct block_device * bdev, gfp_t gfp_mask, sector_t * error_sector)¶
queue a flush
Parameters
struct block_device * bdev
blockdev to issue flush for
gfp_t gfp_mask
memory allocation flags (for bio_alloc)
sector_t * error_sector
error sector
Description
Issue a flush for the block device in question. The caller can supply room for storing the error offset in case of a flush error, if they wish to.
int blkdev_issue_discard (struct block_device * bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, unsigned long flags)¶
queue a discard
Parameters
struct block_device * bdev
blockdev to issue discard for
sector_t sector
start sector
sector_t nr_sects
number of sectors to discard
gfp_t gfp_mask
memory allocation flags (for bio_alloc)
unsigned long flags
BLKDEV_DISCARD_* flags to control behaviour
Description
Issue a discard request for the sectors in question.
int blkdev_issue_write_same (struct block_device * bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, struct page * page)¶
queue a write same operation
Parameters
struct block_device * bdev
target blockdev
sector_t sector
start sector
sector_t nr_sects
number of sectors to
write
gfp_t gfp_mask
memory allocation flags (for bio_alloc)
struct page * page
page containing data
Description
Issue a write same request for the sectors in question.
int __blkdev_issue_zeroout (struct block_device * bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, struct bio ** biop, unsigned flags)¶
generate a number of zero-filled write bios
Parameters
struct block_device * bdev
blockdev to issue
sector_t sector
start sector
sector_t nr_sects
number of sectors to write
gfp_t gfp_mask
memory allocation flags (for bio_alloc)
struct bio ** biop
pointer to anchor bio
unsigned flags
controls detailed behavior
Description
Zero-fill a block range, either using hardware offload or by explicitly writing zeroes to the device.
Note that this function may fail with -EOPNOTSUPP if the driver signals zeroing offload support, but the device fails to process the command (for some devices there is no non-destructive way to verify whether this operation is actually supported).
In this case the caller should retry the call to blkdev_issue_zeroout() and the fallback path will be used.
If a device is using logical block provisioning, the underlying space will not be released if flags contains BLKDEV_ZERO_NOUNMAP.
If flags contains BLKDEV_ZERO_NOFALLBACK, the function will return -EOPNOTSUPP if no explicit hardware offload for zeroing is provided.
int blkdev_issue_zeroout (struct block_device * bdev, sector_t sector, sector_t nr_sects, gfp_t gfp_mask, unsigned flags)¶
zero-fill a block range
Parameters
struct block_device * bdev
blockdev to write
sector_t sector
start sector
sector_t nr_sects
number of sectors to write
gfp_t gfp_mask
memory allocation flags (for bio_alloc)
unsigned flags
controls detailed behavior
Description
Zero-fill a block range, either using hardware offload or by explicitly writing zeroes to the device.
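The offload-then-fallback pattern can be modeled in userspace (hypothetical toy device and helper names; the real code builds and submits bios): try the device's zeroing offload first and, unless NOFALLBACK semantics are requested, fall back to explicitly writing zeroes.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define ZERO_NOFALLBACK 0x1   /* modeled after BLKDEV_ZERO_NOFALLBACK */

/* Toy device: "offload" may be unsupported, forcing an explicit write. */
struct dev {
    unsigned char data[64];
    int has_zero_offload;
};

static int hw_zero_offload(struct dev *d, size_t off, size_t len)
{
    if (!d->has_zero_offload)
        return -EOPNOTSUPP;
    memset(d->data + off, 0, len);   /* hardware would do this internally */
    return 0;
}

static int issue_zeroout(struct dev *d, size_t off, size_t len, unsigned flags)
{
    int ret = hw_zero_offload(d, off, len);
    if (ret != -EOPNOTSUPP)
        return ret;
    if (flags & ZERO_NOFALLBACK)
        return -EOPNOTSUPP;          /* caller forbade explicit writes */
    memset(d->data + off, 0, len);   /* fallback: write zeroes ourselves */
    return 0;
}
```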
See __blkdev_issue_zeroout() for the valid values for flags.
struct request * blk_queue_find_tag (struct request_queue * q, int tag)¶
find a request by its tag and queue
Parameters
struct request_queue * q
The request queue for the device
int tag
The tag of the request
Notes
Should be used when a device returns a tag and you want to match it with a request.
No locks need be held.
void blk_free_tags (struct blk_queue_tag * bqt)¶
release a given set of tag maintenance info
Parameters
struct blk_queue_tag * bqt
the tag map to free
Description
Drops the reference count on bqt and frees it when the last reference is dropped.
void blk_queue_free_tags (struct request_queue * q)¶
release tag maintenance info
Parameters
struct request_queue * q
the request queue for the device
Notes
This is used to disable tagged queuing to a device, yet leave the queue in function.
struct blk_queue_tag * blk_init_tags (int depth, int alloc_policy)¶
initialize the tag info for an external tag map
Parameters
int depth
the maximum queue depth supported
int alloc_policy
tag allocation policy
int blk_queue_init_tags (struct request_queue * q, int depth, struct blk_queue_tag * tags, int alloc_policy)¶
initialize the queue tag info
Parameters
struct request_queue * q
the request queue for the device
int depth
the maximum queue depth supported
struct blk_queue_tag * tags
the tag to use
int alloc_policy
tag allocation policy
Description
The queue lock must be held here if the function is called to resize an existing map.
int blk_queue_resize_tags (struct request_queue * q, int new_depth)¶
change the queueing depth
Parameters
struct request_queue * q
the request queue for the device
int new_depth
the new max command queueing depth
Notes
Must be called with the
queue lock held.
void blk_queue_end_tag (struct request_queue * q, struct request * rq)¶
end tag operations for a request
Parameters
struct request_queue * q
the request queue for the device
struct request * rq
the request that has completed
Description
Typically called when end_that_request_first() returns 0, meaning all transfers have been done for a request. It's important to call this function before end_that_request_last(), as that will put the request back on the free list, thus corrupting the internal tag list.
Notes
queue lock must be held.
int blk_queue_start_tag (struct request_queue * q, struct request * rq)¶
find a free tag and assign it
Parameters
struct request_queue * q
the request queue for the device
struct request * rq
the block request that needs tagging
Description
This can either be used as a stand-alone helper, or possibly be assigned as the queue prep_rq_fn (in which case struct request automagically gets a tag assigned). Note that this function assumes that any type of request can be queued! If this is not true for your device, you must check the request type before calling this function. The request will also be removed from the request queue, so it's the driver's responsibility to re-add it if it should need to be restarted for some reason.
Notes
queue lock must be held.
void blk_queue_invalidate_tags (struct request_queue * q)¶
invalidate all pending tags
Parameters
struct request_queue * q
the request queue for the device
Description
Hardware conditions may dictate a need to stop all pending requests.
In this case, we will safely clear the block side of the tag queue and re-add all requests to the request queue in the right order.
Notes
queue lock must be held.
void __blk_queue_free_tags (struct request_queue * q)¶
release tag maintenance info
Parameters
struct request_queue * q
the request queue for the device
Notes
blk_cleanup_queue() will take care of calling this function, if tagging has been used. So there's no need to call this directly.
int blk_rq_count_integrity_sg (struct request_queue * q, struct bio * bio)¶
Count number of integrity scatterlist elements
Parameters
struct request_queue * q
request queue
struct bio * bio
bio with integrity metadata attached
Description
Returns the number of elements required in a scatterlist corresponding to the integrity metadata in a bio.
int blk_rq_map_integrity_sg (struct request_queue * q, struct bio * bio, struct scatterlist * sglist)¶
Map integrity metadata into a scatterlist
Parameters
struct request_queue * q
request queue
struct bio * bio
bio with integrity metadata attached
struct scatterlist * sglist
target scatterlist
Description
Map the integrity vectors in request into a scatterlist. The scatterlist must be big enough to hold all elements, i.e. sized using blk_rq_count_integrity_sg().
int blk_integrity_compare (struct gendisk * gd1, struct gendisk * gd2)¶
Compare integrity profile of two disks
Parameters
struct gendisk * gd1
Disk to compare
struct gendisk * gd2
Disk to compare
Description
Meta-devices like DM and MD need to verify that all sub-devices use the same integrity format before advertising to upper layers that they can send/receive integrity metadata.
This function can be used to check whether two gendisk devices have compatible integrity formats.
void blk_integrity_register (struct gendisk * disk, struct blk_integrity * template)¶
Register a gendisk as being integrity-capable
Parameters
struct gendisk * disk
struct gendisk pointer to make integrity-aware
struct blk_integrity * template
block integrity profile to register
Description
When a device needs to advertise itself as being able to send/receive integrity metadata it must use this function to register the capability with the block layer. The template is a blk_integrity struct with values appropriate for the underlying hardware. See Documentation/block/data-integrity.txt.
void blk_integrity_unregister (struct gendisk * disk)¶
Unregister block integrity profile
Parameters
struct gendisk * disk
disk whose integrity profile to unregister
Description
This function unregisters the integrity capability from a block device.
int blk_trace_ioctl (struct block_device * bdev, unsigned cmd, char __user * arg)¶
handle the ioctls associated with tracing
Parameters
struct block_device * bdev
the block device
unsigned cmd
the ioctl cmd
char __user * arg
the argument data, if any
void blk_trace_shutdown (struct request_queue * q)¶
stop and cleanup trace structures
Parameters
struct request_queue * q
the request queue associated with the device
void blk_add_trace_rq (struct request * rq, int error, unsigned int nr_bytes, u32 what)¶
Add a trace for a request oriented action
Parameters
struct request * rq
the source request
int error
return status to log
unsigned int nr_bytes
number of completed bytes
u32 what
the action
Description
Records an action against a request.
Will log the bio offset + size.
void blk_add_trace_bio (struct request_queue * q, struct bio * bio, u32 what, int error)¶
Add a trace for a bio oriented action
Parameters
struct request_queue * q
queue the io is for
struct bio * bio
the source bio
u32 what
the action
int error
error, if any
Description
Records an action against a bio. Will log the bio offset + size.
void blk_add_trace_bio_remap (void * ignore, struct request_queue * q, struct bio * bio, dev_t dev, sector_t from)¶
Add a trace for a bio-remap operation
Parameters
void * ignore
trace callback data parameter (not used)
struct request_queue * q
queue the io is for
struct bio * bio
the source bio
dev_t dev
target device
sector_t from
source sector
Description
Device mapper or RAID targets sometimes need to split a bio because it spans a stripe (or similar). Add a trace for that action.
void blk_add_trace_rq_remap (void * ignore, struct request_queue * q, struct request * rq, dev_t dev, sector_t from)¶
Add a trace for a request-remap operation
Parameters
void * ignore
trace callback data parameter (not used)
struct request_queue * q
queue the io is for
struct request * rq
the source request
dev_t dev
target device
sector_t from
source sector
Description
Device mapper remaps requests to other devices. Add a trace for that action.
int blk_mangle_minor (int minor)¶
scatter minor numbers apart
Parameters
int minor
minor number to mangle
Description
Scatter consecutively allocated minor numbers apart if MANGLE_DEVT is enabled.
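The scattering and the self-inverse property can be demonstrated with a bit-reversal mangle (a plausible sketch of the idea; the kernel's exact transform may differ): reversing the low minor bits spreads consecutive values far apart, and applying the transform twice restores the original.

```c
#include <assert.h>

#define MINORBITS 20   /* width of the minor-number field, as in the kernel */

/* Reverse the low MINORBITS bits of minor: consecutive minors scatter
 * apart, and applying the transform twice restores the original value. */
static int mangle_minor(int minor)
{
    int i, out = 0;

    for (i = 0; i < MINORBITS; i++)
        if (minor & (1 << i))
            out |= 1 << (MINORBITS - 1 - i);
    return out;
}
```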
Mangling twice gives the\r\noriginal value.\r\nReturn\r\nMangled value.\r\nContext\r\nDon’t care.\r\nint blk_alloc_devt (struct hd_struct * part, dev_t * devt)¶\r\nallocate a dev_t for a partition\r\nParameters\r\nstruct hd_struct * part\r\npartition to allocate dev_t for\r\ndev_t * devt\r\nout parameter for resulting dev_t\r\nDescription\r\nAllocate a dev_t for a block device.\r\nReturn\r\n0 on success, allocated dev_t is returned in *devt. -errno on failure.\r\nContext\r\nMight sleep.\r\nvoid blk_free_devt (dev_t devt)¶\r\nfree a dev_t\r\nParameters\r\ndev_t devt\r\ndev_t to free\r\nDescription\r\nFree devt which was allocated using blk_alloc_devt() .\r\nContext\r\nMight sleep.\r\nvoid disk_replace_part_tbl (struct gendisk * disk, struct disk_part_tbl * new_ptbl)¶\r\nreplace disk-\u003epart_tbl in RCU-safe way\r\nParameters\r\nstruct gendisk * disk\r\ndisk to replace part_tbl for\r\nstruct disk_part_tbl * new_ptbl\r\nnew part_tbl to install\r\nDescription\r\nReplace disk-\u003epart_tbl with new_ptbl in RCU-safe way. The original ptbl is freed using RCU callback.\r\nLOCKING: Matching bd_mutex locked.\r\nint disk_expand_part_tbl (struct gendisk * disk, int partno)¶\r\nexpand disk-\u003epart_tbl\r\nParameters\r\nstruct gendisk * disk\r\ndisk to expand part_tbl for\r\nint partno\r\nexpand such that this partno can fit in\r\nDescription\r\nExpand disk-\u003epart_tbl such that partno can fit in. 
disk-\u003epart_tbl uses RCU to allow unlocked dereferencing for\r\nstats and other stuff.\r\nLOCKING: Matching bd_mutex locked, might sleep.\r\nReturn\r\n0 on success, -errno on failure.\r\nvoid disk_block_events (struct gendisk * disk)¶\r\nblock and flush disk event checking\r\nParameters\r\nstruct gendisk * disk\r\ndisk to block events for\r\nDescription\r\nOn return from this function, it is guaranteed that event checking isn’t in progress and won’t happen until\r\nunblocked by disk_unblock_events() . Event blocking is counted and the actual unblocking happens after the\r\nmatching number of unblocks are done.\r\nNote that this intentionally does not block event checking from disk_clear_events() .\r\nContext\r\nMight sleep.\r\nvoid disk_unblock_events (struct gendisk * disk)¶\r\nunblock disk event checking\r\nParameters\r\nstruct gendisk * disk\r\ndisk to unblock events for\r\nDescription\r\nUndo disk_block_events() . When the block count reaches zero, it starts events polling if configured.\r\nContext\r\nDon’t care. Safe to call from irq context.\r\nvoid disk_flush_events (struct gendisk * disk, unsigned int mask)¶\r\nschedule immediate event checking and flushing\r\nParameters\r\nstruct gendisk * disk\r\ndisk to check and flush events for\r\nunsigned int mask\r\nevents to flush\r\nDescription\r\nSchedule immediate event checking on disk if not blocked. Events in mask are scheduled to be cleared from the\r\ndriver. 
Note that this doesn’t clear the events from disk-\u003eev.\r\nContext\r\nIf mask is non-zero, this must be called with bdev-\u003ebd_mutex held.\r\nunsigned int disk_clear_events (struct gendisk * disk, unsigned int mask)¶\r\nsynchronously check, clear and return pending events\r\nParameters\r\nstruct gendisk * disk\r\ndisk to fetch and clear events from\r\nunsigned int mask\r\nmask of events to be fetched and cleared\r\nDescription\r\nDisk events are synchronously checked and pending events in mask are cleared and returned. This ignores the\r\nblock count.\r\nContext\r\nMight sleep.\r\nstruct hd_struct * disk_get_part (struct gendisk * disk, int partno)¶\r\nget partition\r\nParameters\r\nstruct gendisk * disk\r\ndisk to look up the partition from\r\nint partno\r\npartition number\r\nDescription\r\nLook for partition partno from disk. If found, increment reference count and return it.\r\nContext\r\nDon’t care.\r\nReturn\r\nPointer to the found partition on success, NULL if not found.\r\nvoid disk_part_iter_init (struct disk_part_iter * piter, struct gendisk * disk, unsigned int flags)¶\r\ninitialize partition iterator\r\nParameters\r\nstruct disk_part_iter * piter\r\niterator to initialize\r\nstruct gendisk * disk\r\ndisk to iterate over\r\nunsigned int flags\r\nDISK_PITER_* flags\r\nDescription\r\nInitialize piter so that it iterates over partitions of disk.\r\nContext\r\nDon’t care.\r\nstruct hd_struct * disk_part_iter_next (struct disk_part_iter * piter)¶\r\nproceed iterator to the next partition and return it\r\nParameters\r\nstruct disk_part_iter * piter\r\niterator of interest\r\nDescription\r\nProceed piter to the next partition and return it.\r\nContext\r\nDon’t care.\r\nvoid disk_part_iter_exit (struct disk_part_iter * piter)¶\r\nfinish up partition iteration\r\nParameters\r\nstruct disk_part_iter * 
piter\r\niterator of interest\r\nDescription\r\nCalled when iteration is over. Cleans up piter.\r\nContext\r\nDon’t care.\r\nstruct hd_struct * disk_map_sector_rcu (struct gendisk * disk, sector_t sector)¶\r\nmap sector to partition\r\nParameters\r\nstruct gendisk * disk\r\ngendisk of interest\r\nsector_t sector\r\nsector to map\r\nDescription\r\nFind out which partition sector maps to on disk. This is primarily used for stats accounting.\r\nContext\r\nRCU read locked. The returned partition pointer is valid only while preemption is disabled.\r\nReturn\r\nFound partition on success, part0 is returned if no partition matches\r\nint register_blkdev (unsigned int major, const char * name)¶\r\nregister a new block device\r\nParameters\r\nunsigned int major\r\nthe requested major device number [1..255]. If major = 0, try to allocate any unused major number.\r\nconst char * name\r\nthe name of the new block device as a zero terminated string\r\nDescription\r\nThe name must be unique within the system.\r\nThe return value depends on the major input parameter:\r\nif a major device number was requested in range [1..255] then the function returns zero on\r\nsuccess, or a negative error code\r\nif any unused major number was requested with major = 0 parameter then the return value is the\r\nallocated major number in range [1..255] or a negative error code otherwise\r\nvoid device_add_disk (struct device * parent, struct gendisk * disk)¶\r\nadd partitioning information to kernel list\r\nParameters\r\nstruct device * parent\r\nparent device for the disk\r\nstruct gendisk * disk\r\nper-device partitioning information\r\nDescription\r\nThis function registers the partitioning information in disk with the kernel.\r\nFIXME: error handling\r\nstruct gendisk * get_gendisk (dev_t devt, int * partno)¶\r\nget partitioning information for a given device\r\nParameters\r\ndev_t devt\r\ndevice to get 
partitioning information for\r\nint * partno\r\nreturned partition index\r\nDescription\r\nThis function gets the structure containing partitioning information for the given device devt.\r\nstruct block_device * bdget_disk (struct gendisk * disk, int partno)¶\r\ndo bdget() by gendisk and partition number\r\nParameters\r\nstruct gendisk * disk\r\ngendisk of interest\r\nint partno\r\npartition number\r\nDescription\r\nFind partition partno from disk, do bdget() on it.\r\nContext\r\nDon’t care.\r\nReturn\r\nResulting block_device on success, NULL on failure.\r\nChar devices¶\r\nint register_chrdev_region (dev_t from, unsigned count, const char * name)¶\r\nregister a range of device numbers\r\nParameters\r\ndev_t from\r\nthe first in the desired range of device numbers; must include the major number.\r\nunsigned count\r\nthe number of consecutive device numbers required\r\nconst char * name\r\nthe name of the device or driver.\r\nDescription\r\nReturn value is zero on success, a negative error code on failure.\r\nint alloc_chrdev_region (dev_t * dev, unsigned baseminor, unsigned count, const char * name)¶\r\nregister a range of char device numbers\r\nParameters\r\ndev_t * dev\r\noutput parameter for first assigned number\r\nunsigned baseminor\r\nfirst of the requested range of minor numbers\r\nunsigned count\r\nthe number of minor numbers required\r\nconst char * name\r\nthe name of the associated device or driver\r\nDescription\r\nAllocates a range of char device numbers. The major number will be chosen dynamically, and returned (along\r\nwith the first minor number) in dev. 
Returns zero or a negative error code.\r\nint __register_chrdev (unsigned int major, unsigned int baseminor, unsigned int count, const char * name, const\r\nstruct file_operations * fops)¶\r\ncreate and register a cdev occupying a range of minors\r\nParameters\r\nunsigned int major\r\nmajor device number or 0 for dynamic allocation\r\nunsigned int baseminor\r\nfirst of the requested range of minor numbers\r\nunsigned int count\r\nthe number of minor numbers required\r\nconst char * name\r\nname of this range of devices\r\nconst struct file_operations * fops\r\nfile operations associated with these devices\r\nDescription\r\nIf major == 0 this function will dynamically allocate a major and return its number.\r\nIf major \u003e 0 this function will attempt to reserve a device with the given major number and will return zero on\r\nsuccess.\r\nReturns a negative errno on failure.\r\nThe name of this device has nothing to do with the name of the device in /dev. It only helps to keep track of the\r\ndifferent owners of devices. If your module has only one type of device, it’s ok to use e.g. the name of the\r\nmodule here.\r\nvoid unregister_chrdev_region (dev_t from, unsigned count)¶\r\nunregister a range of device numbers\r\nParameters\r\ndev_t from\r\nthe first in the range of numbers to unregister\r\nunsigned count\r\nthe number of device numbers to unregister\r\nDescription\r\nThis function will unregister a range of count device numbers, starting with from. 
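The registration and unregistration calls above pair up in the usual char-device lifecycle; the following is a hedged kernel-context sketch (it only builds against a kernel tree; names are hypothetical and error paths abbreviated):

```c
#include <linux/cdev.h>
#include <linux/fs.h>
#include <linux/module.h>

static dev_t my_devt;                  /* hypothetical driver state */
static struct cdev my_cdev;
static const struct file_operations my_fops = { .owner = THIS_MODULE };

static int __init my_init(void)
{
	/* reserve one dynamically chosen major with minor 0 */
	int ret = alloc_chrdev_region(&my_devt, 0, 1, "mydev");
	if (ret)
		return ret;
	cdev_init(&my_cdev, &my_fops);
	ret = cdev_add(&my_cdev, my_devt, 1);  /* live immediately on success */
	if (ret)
		unregister_chrdev_region(my_devt, 1);
	return ret;
}

static void __exit my_exit(void)
{
	cdev_del(&my_cdev);
	unregister_chrdev_region(my_devt, 1);  /* the allocator frees the range */
}

module_init(my_init);
module_exit(my_exit);
MODULE_LICENSE("GPL");
```
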
The caller should normally be\r\nthe one who allocated those numbers in the first place...\r\nvoid __unregister_chrdev (unsigned int major, unsigned int baseminor, unsigned int count, const char * name)¶\r\nunregister and destroy a cdev\r\nParameters\r\nunsigned int major\r\nmajor device number\r\nunsigned int baseminor\r\nfirst of the range of minor numbers\r\nunsigned int count\r\nthe number of minor numbers this cdev is occupying\r\nconst char * name\r\nname of this range of devices\r\nDescription\r\nUnregister and destroy the cdev occupying the region described by major, baseminor and count. This function\r\nundoes what __register_chrdev() did.\r\nint cdev_add (struct cdev * p, dev_t dev, unsigned count)¶\r\nadd a char device to the system\r\nParameters\r\nstruct cdev * p\r\nthe cdev structure for the device\r\ndev_t dev\r\nthe first device number for which this device is responsible\r\nunsigned count\r\nthe number of consecutive minor numbers corresponding to this device\r\nDescription\r\ncdev_add() adds the device represented by p to the system, making it live immediately. A negative error code is\r\nreturned on failure.\r\nvoid cdev_set_parent (struct cdev * p, struct kobject * kobj)¶\r\nset the parent kobject for a char device\r\nParameters\r\nstruct cdev * p\r\nthe cdev structure\r\nstruct kobject * kobj\r\nthe kobject to take a reference to\r\nDescription\r\ncdev_set_parent() sets a parent kobject which will be referenced appropriately so the parent is not freed before\r\nthe cdev. This should be called before cdev_add.\r\nint cdev_device_add (struct cdev * cdev, struct device * dev)¶\r\nadd a char device and its corresponding struct device, linking them\r\nParameters\r\nstruct cdev * cdev\r\nthe cdev structure\r\nstruct device * dev\r\nthe device structure\r\nDescription\r\ncdev_device_add() adds the char device represented by cdev to the system, just as cdev_add does. 
It then adds\r\ndev to the system using device_add. The dev_t for the char device will be taken from the struct device which needs\r\nto be initialized first. This helper function correctly takes a reference to the parent device so the parent will not get\r\nreleased until all references to the cdev are released.\r\nThis helper uses dev-\u003edevt for the device number. If it is not set it will not add the cdev and it will be equivalent\r\nto device_add.\r\nThis function should be used whenever the struct cdev and the struct device are members of the same structure\r\nwhose lifetime is managed by the struct device.\r\nNOTE\r\nCallers must assume that userspace was able to open the cdev and can call cdev fops callbacks at any time, even if\r\nthis function fails.\r\nvoid cdev_device_del (struct cdev * cdev, struct device * dev)¶\r\ninverse of cdev_device_add\r\nParameters\r\nstruct cdev * cdev\r\nthe cdev structure\r\nstruct device * dev\r\nthe device structure\r\nDescription\r\ncdev_device_del() is a helper function to call cdev_del and device_del. 
It should be used whenever\r\ncdev_device_add is used.\r\nIf dev-\u003edevt is not set it will not remove the cdev and will be equivalent to device_del.\r\nNOTE\r\nThis guarantees that associated sysfs callbacks are not running or runnable, however any cdevs already open will\r\nremain and their fops will still be callable even after this function returns.\r\nvoid cdev_del (struct cdev * p)¶\r\nremove a cdev from the system\r\nParameters\r\nstruct cdev * p\r\nthe cdev structure to be removed\r\nDescription\r\ncdev_del() removes p from the system, possibly freeing the structure itself.\r\nNOTE\r\nThis guarantees that the cdev device will no longer be able to be opened, however any cdevs already open will remain\r\nand their fops will still be callable even after cdev_del returns.\r\nstruct cdev * cdev_alloc (void)¶\r\nallocate a cdev structure\r\nParameters\r\nvoid\r\nno arguments\r\nDescription\r\nAllocates and returns a cdev structure, or NULL on failure.\r\nvoid cdev_init (struct cdev * cdev, const struct file_operations * fops)¶\r\ninitialize a cdev structure\r\nParameters\r\nstruct cdev * cdev\r\nthe structure to initialize\r\nconst struct file_operations * fops\r\nthe file_operations for this device\r\nDescription\r\nInitializes cdev, remembering fops, making it ready to add to the system with cdev_add() .\r\nClock Framework¶\r\nThe clock framework defines programming interfaces to support software management of the system clock tree.\r\nThis framework is widely used with System-On-Chip (SOC) platforms to support power management and various\r\ndevices which may need custom clock rates. Note that these “clocks” don’t relate to timekeeping or real time\r\nclocks (RTCs), each of which has a separate framework. 
These struct clk instances may be used to manage\r\nfor example a 96 MHz signal that is used to shift bits into and out of peripherals or busses, or otherwise trigger\r\nsynchronous state machine transitions in system hardware.\r\nPower management is supported by explicit software clock gating: unused clocks are disabled, so the system\r\ndoesn’t waste power changing the state of transistors that aren’t in active use. On some systems this may be\r\nbacked by hardware clock gating, where clocks are gated without being disabled in software. Sections of chips\r\nthat are powered but not clocked may be able to retain their last state. This low power state is often called a\r\nretention mode. This mode still incurs leakage currents, especially with finer circuit geometries, but for CMOS\r\ncircuits power is mostly used by clocked state changes.\r\nPower-aware drivers only enable their clocks when the device they manage is in active use. Also, system sleep\r\nstates often differ according to which clock domains are active: while a “standby” state may allow wakeup from\r\nseveral active domains, a “mem” (suspend-to-RAM) state may require a more wholesale shutdown of clocks\r\nderived from higher speed PLLs and oscillators, limiting the number of possible wakeup event sources. A driver’s\r\nsuspend method may need to be aware of system-specific clock constraints on the target sleep state.\r\nSome platforms support programmable clock generators. 
These can be used by external chips of various kinds,\r\nsuch as other CPUs, multimedia codecs, and devices with strict requirements for interface clocking.\r\nstruct clk_notifier ¶\r\nassociate a clk with a notifier\r\nDefinition\r\nstruct clk_notifier {\r\n struct clk * clk;\r\n struct srcu_notifier_head notifier_head;\r\n struct list_head node;\r\n};\r\nMembers\r\nclk\r\nstruct clk * to associate the notifier with\r\nnotifier_head\r\nan srcu_notifier_head for this clk\r\nnode\r\nlinked list pointers\r\nDescription\r\nA list of struct clk_notifier is maintained by the notifier code. An entry is created whenever code registers the first\r\nnotifier on a particular clk. Future notifiers on that clk are added to the notifier_head.\r\nstruct clk_notifier_data ¶\r\nrate data to pass to the notifier callback\r\nDefinition\r\nstruct clk_notifier_data {\r\n struct clk * clk;\r\n unsigned long old_rate;\r\n unsigned long new_rate;\r\n};\r\nMembers\r\nclk\r\nstruct clk * being changed\r\nold_rate\r\nprevious rate of this clk\r\nnew_rate\r\nnew rate of this clk\r\nDescription\r\nFor a pre-notifier, old_rate is the clk’s rate before this rate change, and new_rate is what the rate will be in the\r\nfuture. For a post-notifier, old_rate and new_rate are both set to the clk’s current rate (this was done to optimize\r\nthe implementation).\r\nint clk_notifier_register (struct clk * clk, struct notifier_block * nb)¶\r\nchange notifier callback\r\nParameters\r\nstruct clk * clk\r\nclock whose rate we are interested in\r\nstruct notifier_block * nb\r\nnotifier block with callback function pointer\r\nDescription\r\nProTip: debugging across notifier chains can be frustrating. 
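A registered callback typically switches on the event code and receives the struct clk_notifier_data described above; a hedged kernel-context sketch (it only builds against a kernel tree; the callback name and rate limit are hypothetical):

```c
#include <linux/clk.h>
#include <linux/notifier.h>

/* Hedged sketch of a rate-change notifier callback. */
static int my_clk_notify(struct notifier_block *nb, unsigned long event,
			 void *data)
{
	struct clk_notifier_data *ndata = data;

	switch (event) {
	case PRE_RATE_CHANGE:
		/* veto an unworkable rate with NOTIFY_BAD */
		if (ndata->new_rate > 96000000UL)
			return NOTIFY_BAD;
		return NOTIFY_OK;
	case POST_RATE_CHANGE:   /* old_rate == new_rate == current rate */
	case ABORT_RATE_CHANGE:
		return NOTIFY_OK;
	default:
		return NOTIFY_DONE;
	}
}

static struct notifier_block my_clk_nb = { .notifier_call = my_clk_notify };
/* registered with: clk_notifier_register(clk, &my_clk_nb); */
```
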
Make sure that your notifier callback function prints a\r\nnice big warning in case of failure.\r\nint clk_notifier_unregister (struct clk * clk, struct notifier_block * nb)¶\r\nchange notifier callback\r\nParameters\r\nstruct clk * clk\r\nclock whose rate we are no longer interested in\r\nstruct notifier_block * nb\r\nnotifier block which will be unregistered\r\nlong clk_get_accuracy (struct clk * clk)¶\r\nobtain the clock accuracy in ppb (parts per billion) for a clock source.\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nDescription\r\nThis gets the clock source accuracy expressed in ppb. A perfect clock returns 0.\r\nint clk_set_phase (struct clk * clk, int degrees)¶\r\nadjust the phase shift of a clock signal\r\nParameters\r\nstruct clk * clk\r\nclock signal source\r\nint degrees\r\nnumber of degrees the signal is shifted\r\nDescription\r\nShifts the phase of a clock signal by the specified degrees. Returns 0 on success, a negative errno otherwise.\r\nint clk_get_phase (struct clk * clk)¶\r\nreturn the phase shift of a clock signal\r\nParameters\r\nstruct clk * clk\r\nclock signal source\r\nDescription\r\nReturns the phase shift of a clock node in degrees, otherwise returns a negative errno.\r\nbool clk_is_match (const struct clk * p, const struct clk * q)¶\r\ncheck if two clks point to the same hardware clock\r\nParameters\r\nconst struct clk * p\r\nclk compared against q\r\nconst struct clk * q\r\nclk compared against p\r\nDescription\r\nReturns true if the two struct clk pointers both point to the same hardware clock node. Put differently, returns true\r\nif p and q share the same struct clk_core object.\r\nReturns false otherwise. 
Note that two NULL clks are treated as matching.\r\nint clk_prepare (struct clk * clk)¶\r\nprepare a clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nDescription\r\nThis prepares the clock source for use.\r\nMust not be called from within atomic context.\r\nvoid clk_unprepare (struct clk * clk)¶\r\nundo preparation of a clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nDescription\r\nThis undoes a previously prepared clock. The caller must balance the number of prepare and unprepare calls.\r\nMust not be called from within atomic context.\r\nstruct clk * clk_get (struct device * dev, const char * id)¶\r\nlookup and obtain a reference to a clock producer.\r\nParameters\r\nstruct device * dev\r\ndevice for clock “consumer”\r\nconst char * id\r\nclock consumer ID\r\nDescription\r\nReturns a struct clk corresponding to the clock producer, or valid IS_ERR() condition containing errno. The\r\nimplementation uses dev and id to determine the clock consumer, and thereby the clock producer. (IOW, id may\r\nbe identical strings, but clk_get may return different clock producers depending on dev.)\r\nDrivers must assume that the clock source is not enabled.\r\nclk_get should not be called from within interrupt context.\r\nstruct clk * devm_clk_get (struct device * dev, const char * id)¶\r\nlookup and obtain a managed reference to a clock producer.\r\nParameters\r\nstruct device * dev\r\ndevice for clock “consumer”\r\nconst char * id\r\nclock consumer ID\r\nDescription\r\nReturns a struct clk corresponding to the clock producer, or valid IS_ERR() condition containing errno. The\r\nimplementation uses dev and id to determine the clock consumer, and thereby the clock producer. 
(IOW, id may\r\nbe identical strings, but clk_get may return different clock producers depending on dev.)\r\nDrivers must assume that the clock source is not enabled.\r\ndevm_clk_get should not be called from within interrupt context.\r\nThe clock will automatically be freed when the device is unbound from the bus.\r\nstruct clk * devm_get_clk_from_child (struct device * dev, struct device_node * np, const char * con_id)¶\r\nlookup and obtain a managed reference to a clock producer from child node.\r\nParameters\r\nstruct device * dev\r\ndevice for clock “consumer”\r\nstruct device_node * np\r\npointer to clock consumer node\r\nconst char * con_id\r\nclock consumer ID\r\nDescription\r\nThis function parses the clocks, and uses them to look up the struct clk from the registered list of clock providers\r\nby using np and con_id.\r\nThe clock will automatically be freed when the device is unbound from the bus.\r\nint clk_enable (struct clk * clk)¶\r\ninform the system when the clock source should be running.\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nDescription\r\nIf the clock cannot be enabled/disabled, this should return success.\r\nMay be called from atomic contexts.\r\nReturns success (0) or negative errno.\r\nvoid clk_disable (struct clk * clk)¶\r\ninform the system when the clock source is no longer required.\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nDescription\r\nInform the system that a clock source is no longer required by a driver and may be shut down.\r\nMay be called from atomic contexts.\r\nImplementation detail: if the clock source is shared between multiple drivers, clk_enable() calls must be\r\nbalanced by the same number of clk_disable() calls for the clock source to be disabled.\r\nunsigned long clk_get_rate (struct clk * clk)¶\r\nobtain the current clock rate 
(in Hz) for a clock source. This is only valid once the clock source has been\r\nenabled.\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nvoid clk_put (struct clk * clk)¶\r\n“free” the clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nNote\r\ndrivers must ensure that all clk_enable calls made on this clock source are balanced by clk_disable calls prior to\r\ncalling this function.\r\nclk_put should not be called from within interrupt context.\r\nvoid devm_clk_put (struct device * dev, struct clk * clk)¶\r\n“free” a managed clock source\r\nParameters\r\nstruct device * dev\r\ndevice used to acquire the clock\r\nstruct clk * clk\r\nclock source acquired with devm_clk_get()\r\nNote\r\ndrivers must ensure that all clk_enable calls made on this clock source are balanced by clk_disable calls prior to\r\ncalling this function.\r\nclk_put should not be called from within interrupt context.\r\nlong clk_round_rate (struct clk * clk, unsigned long rate)¶\r\nadjust a rate to the exact rate a clock can provide\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nunsigned long rate\r\ndesired clock rate in Hz\r\nDescription\r\nThis answers the question “if I were to pass rate to clk_set_rate() , what clock rate would I end up with?”\r\nwithout changing the hardware in any way. 
In other words:\r\nrate = clk_round_rate(clk, r);\r\nand:\r\nclk_set_rate(clk, r); rate = clk_get_rate(clk);\r\nare equivalent except the former does not modify the clock hardware in any way.\r\nReturns rounded clock rate in Hz, or negative errno.\r\nint clk_set_rate (struct clk * clk, unsigned long rate)¶\r\nset the clock rate for a clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nunsigned long rate\r\ndesired clock rate in Hz\r\nDescription\r\nReturns success (0) or negative errno.\r\nbool clk_has_parent (struct clk * clk, struct clk * parent)¶\r\ncheck if a clock is a possible parent for another\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nstruct clk * parent\r\nparent clock source\r\nDescription\r\nThis function can be used in drivers that need to check that a clock can be the parent of another without actually\r\nchanging the parent.\r\nReturns true if parent is a possible parent for clk, false otherwise.\r\nint clk_set_rate_range (struct clk * clk, unsigned long min, unsigned long max)¶\r\nset a rate range for a clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nunsigned long min\r\ndesired minimum clock rate in Hz, inclusive\r\nunsigned long max\r\ndesired maximum clock rate in Hz, inclusive\r\nDescription\r\nReturns success (0) or negative errno.\r\nint clk_set_min_rate (struct clk * clk, unsigned long rate)¶\r\nset a minimum clock rate for a clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nunsigned long rate\r\ndesired minimum clock rate in Hz, inclusive\r\nDescription\r\nReturns success (0) or negative errno.\r\nint clk_set_max_rate (struct clk * clk, unsigned long rate)¶\r\nset a maximum clock rate for a clock source\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nunsigned long rate\r\ndesired maximum clock rate in Hz, 
inclusive\r\nDescription\r\nReturns success (0) or negative errno.\r\nint clk_set_parent (struct clk * clk, struct clk * parent)¶\r\nset the parent clock source for this clock\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nstruct clk * parent\r\nparent clock source\r\nDescription\r\nReturns success (0) or negative errno.\r\nstruct clk * clk_get_parent (struct clk * clk)¶\r\nget the parent clock source for this clock\r\nParameters\r\nstruct clk * clk\r\nclock source\r\nDescription\r\nReturns struct clk corresponding to parent clock source, or valid IS_ERR() condition containing errno.\r\nstruct clk * clk_get_sys (const char * dev_id, const char * con_id)¶\r\nget a clock based upon the device name\r\nParameters\r\nconst char * dev_id\r\ndevice name\r\nconst char * con_id\r\nconnection ID\r\nDescription\r\nReturns a struct clk corresponding to the clock producer, or valid IS_ERR() condition containing errno. The\r\nimplementation uses dev_id and con_id to determine the clock consumer, and thereby the clock producer. In\r\ncontrast to clk_get() this function takes the device name instead of the device itself for identification.\r\nDrivers must assume that the clock source is not enabled.\r\nclk_get_sys should not be called from within interrupt context.\r\nSource: https://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"MITRE"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://www.kernel.org/doc/html/v4.12/core-api/kernel-api.html"
	],
	"report_names": [
		"kernel-api.html"
	],
	"threat_actors": [
		{
			"id": "aa73cd6a-868c-4ae4-a5b2-7cb2c5ad1e9d",
			"created_at": "2022-10-25T16:07:24.139848Z",
			"updated_at": "2026-04-10T02:00:04.878798Z",
			"deleted_at": null,
			"main_name": "Safe",
			"aliases": [],
			"source_name": "ETDA:Safe",
			"tools": [
				"DebugView",
				"LZ77",
				"OpenDoc",
				"SafeDisk",
				"TypeConfig",
				"UPXShell",
				"UsbDoc",
				"UsbExe"
			],
			"source_id": "ETDA",
			"reports": null
		}
	],
	"ts_created_at": 1775434143,
	"ts_updated_at": 1775826745,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/77aa980d7519578731f35e851fd8fd15739e4271.pdf",
		"text": "https://archive.orkl.eu/77aa980d7519578731f35e851fd8fd15739e4271.txt",
		"img": "https://archive.orkl.eu/77aa980d7519578731f35e851fd8fd15739e4271.jpg"
	}
}