System Interfaces Chapter 2 (original) (raw)

The Open Group Base Specifications Issue 6
IEEE Std 1003.1, 2004 Edition

2.8 Realtime

This section defines functions to support the source portability of applications with realtime requirements. The presence of many of these functions is dependent on support for implementation options described in the text.

The specific functional areas included in this section and their scope include the following. Full definitions of these terms can be found in the Base Definitions volume of IEEE Std 1003.1-2001, Chapter 3, Definitions.

Semaphores
Process Memory Locking
Memory Mapped Files and Shared Memory Objects
Priority Scheduling
Realtime Signal Extension
Timers
Interprocess Communication
Synchronized Input and Output
Asynchronous Input and Output

All the realtime functions defined in this volume of IEEE Std 1003.1-2001 are portable, although some of the numeric parameters used by an implementation may have hardware dependencies.

2.8.1 Realtime Signals

[RTS] [Option Start] Realtime signal generation and delivery is dependent on support for the Realtime Signals Extension option. [Option End]

See Realtime Signal Generation and Delivery.

2.8.2 Asynchronous I/O

[AIO] [Option Start] The functionality described in this section is dependent on support of the Asynchronous Input and Output option (and the rest of this section is not further marked for this option). [Option End]

An asynchronous I/O control block structure aiocb is used in many asynchronous I/O functions. It is defined in the Base Definitions volume of IEEE Std 1003.1-2001, <aio.h> and has at least the following members:

Member Type	Member Name	Description
int	aio_fildes	File descriptor.
off_t	aio_offset	File offset.
volatile void*	aio_buf	Location of buffer.
size_t	aio_nbytes	Length of transfer.
int	aio_reqprio	Request priority offset.
struct sigevent	aio_sigevent	Signal number and value.
int	aio_lio_opcode	Operation to be performed.

The aio_fildes element is the file descriptor on which the asynchronous operation is performed.

If O_APPEND is not set for the file descriptor aio_fildes and if aio_fildes is associated with a device that is capable of seeking, then the requested operation takes place at the absolute position in the file as given by aio_offset, as if lseek() were called immediately prior to the operation with an _offset_argument equal to aio_offset and a whence argument equal to SEEK_SET. If O_APPEND is set for the file descriptor, or if aio_fildes is associated with a device that is incapable of seeking, write operations append to the file in the same order as the calls were made, with the following exception: under implementation-defined circumstances, such as operation on a multi-processor or when requests of differing priorities are submitted at the same time, the ordering restriction may be relaxed. Since there is no way for a strictly conforming application to determine whether this relaxation applies, all strictly conforming applications which rely on ordering of output shall be written in such a way that they will operate correctly if the relaxation applies. After a successful call to enqueue an asynchronous I/O operation, the value of the file offset for the file is unspecified. The aio_nbytes and aio_buf elements are the same as the nbyte and buf arguments defined byread() and write(), respectively.

If _POSIX_PRIORITIZED_IO and _POSIX_PRIORITY_SCHEDULING are defined, then asynchronous I/O is queued in priority order, with the priority of each asynchronous operation based on the current scheduling priority of the calling process. The _aio_reqprio_member can be used to lower (but not raise) the asynchronous I/O operation priority and is within the range zero through {AIO_PRIO_DELTA_MAX}, inclusive. Unless both _POSIX_PRIORITIZED_IO and _POSIX_PRIORITY_SCHEDULING are defined, the order of processing asynchronous I/O requests is unspecified. When both _POSIX_PRIORITIZED_IO and _POSIX_PRIORITY_SCHEDULING are defined, the order of processing of requests submitted by processes whose schedulers are not SCHED_FIFO, SCHED_RR, or SCHED_SPORADIC is unspecified. The priority of an asynchronous request is computed as (process scheduling priority) minus aio_reqprio. The priority assigned to each asynchronous I/O request is an indication of the desired order of execution of the request relative to other asynchronous I/O requests for this file. If _POSIX_PRIORITIZED_IO is defined, requests issued with the same priority to a character special file are processed by the underlying device in FIFO order; the order of processing of requests of the same priority issued to files that are not character special files is unspecified. Numerically higher priority values indicate requests of higher priority. The value of aio_reqprio has no effect on process scheduling priority. When prioritized asynchronous I/O requests to the same file are blocked waiting for a resource required for that I/O operation, the higher-priority I/O requests shall be granted the resource before lower-priority I/O requests are granted the resource. The relative priority of asynchronous I/O and synchronous I/O is implementation-defined. If _POSIX_PRIORITIZED_IO is defined, the implementation shall define for which files I/O prioritization is supported.

The aio_sigevent determines how the calling process shall be notified upon I/O completion, as specified in Signal Generation and Delivery. If aio_sigevent.sigev_notify is SIGEV_NONE, then no signal shall be posted upon I/O completion, but the error status for the operation and the return status for the operation shall be set appropriately.

The aio_lio_opcode field is used only by the lio_listio() call. The lio_listio() call allows multiple asynchronous I/O operations to be submitted at a single time. The function takes as an argument an array of pointers to aiocb structures. Each aiocb structure indicates the operation to be performed (read or write) via the aio_lio_opcode field.

The address of the aiocb structure is used as a handle for retrieving the error status and return status of the asynchronous operation while it is in progress.

The aiocb structure and the data buffers associated with the asynchronous I/O operation are being used by the system for asynchronous I/O while, and only while, the error status of the asynchronous operation is equal to [EINPROGRESS]. Applications shall not modify the aiocb structure while the structure is being used by the system for asynchronous I/O.

The return status of the asynchronous operation is the number of bytes transferred by the I/O operation. If the error status is set to indicate an error completion, then the return status is set to the return value that the corresponding read(), write(), or fsync() call would have returned. When the error status is not equal to [EINPROGRESS], the return status shall reflect the return status of the corresponding synchronous operation.

2.8.3 Memory Management

Memory Locking

[MR] [Option Start] Range memory locking operations are defined in terms of pages. Implementations may restrict the size and alignment of range lockings to be on page-size boundaries. The page size, in bytes, is the value of the configurable system variable {PAGESIZE}. If an implementation has no restrictions on size or alignment, it may specify a 1-byte page size. [Option End]

[ML|MLR] [Option Start] Memory locking guarantees the residence of portions of the address space. It is implementation-defined whether locking memory guarantees fixed translation between virtual addresses (as seen by the process) and physical addresses. Per-process memory locks are not inherited across a fork(), and all memory locks owned by a process are unlocked upon exec or process termination. Unmapping of an address range removes any memory locks established on that address range by this process. [Option End]

Memory Mapped Files

[MF] [Option Start] The functionality described in this section is dependent on support of the Memory Mapped Files option (and the rest of this section is not further marked for this option). [Option End]

Range memory mapping operations are defined in terms of pages. Implementations may restrict the size and alignment of range mappings to be on page-size boundaries. The page size, in bytes, is the value of the configurable system variable {PAGESIZE}. If an implementation has no restrictions on size or alignment, it may specify a 1-byte page size.

Memory mapped files provide a mechanism that allows a process to access files by directly incorporating file data into its address space. Once a file is mapped into a process address space, the data can be manipulated as memory. If more than one process maps a file, its contents are shared among them. If the mappings allow shared write access, then data written into the memory object through the address space of one process appears in the address spaces of all processes that similarly map the same portion of the memory object.

[SHM] [Option Start] Shared memory objects are named regions of storage that may be independent of the file system and can be mapped into the address space of one or more processes to allow them to share the associated memory. [Option End]

An unlink() of a file [SHM] [Option Start] or shm_unlink() of a shared memory object, [Option End] while causing the removal of the name, does not unmap any mappings established for the object. Once the name has been removed, the contents of the memory object are preserved as long as it is referenced. The memory object remains referenced as long as a process has the memory object open or has some area of the memory object mapped.

Memory Protection

[MPR MF] [Option Start] The functionality described in this section is dependent on support of the Memory Protection and Memory Mapped Files option (and the rest of this section is not further marked for these options). [Option End]

When an object is mapped, various application accesses to the mapped region may result in signals. In this context, SIGBUS is used to indicate an error using the mapped object, and SIGSEGV is used to indicate a protection violation or misuse of an address:

A mapping may be restricted to disallow some types of access.
Write attempts to memory that was mapped without write access, or any access to memory mapped PROT_NONE, shall result in a SIGSEGV signal.
References to unmapped addresses shall result in a SIGSEGV signal.
Reference to whole pages within the mapping, but beyond the current length of the object, shall result in a SIGBUS signal.
The size of the object is unaffected by access beyond the end of the object (even if a SIGBUS is not generated).

Typed Memory Objects

[TYM] [Option Start] The functionality described in this section is dependent on support of the Typed Memory Objects option (and the rest of this section is not further marked for this option). [Option End]

Implementations may support the Typed Memory Objects option without supporting the Memory Mapped Files option or the Shared Memory Objects option. Typed memory objects are implementation-configurable named storage pools accessible from one or more processors in a system, each via one or more ports, such as backplane buses, LANs, I/O channels, and so on. Each valid combination of a storage pool and a port is identified through a name that is defined at system configuration time, in an implementation-defined manner; the name may be independent of the file system. Using this name, a typed memory object can be opened and mapped into process address space. For a given storage pool and port, it is necessary to support both dynamic allocation from the pool as well as mapping at an application-supplied offset within the pool; when dynamic allocation has been performed, subsequent deallocation must be supported. Lastly, accessing typed memory objects from different ports requires a method for obtaining the offset and length of contiguous storage of a region of typed memory (dynamically allocated or not); this allows typed memory to be shared among processes and/or processors while being accessed from the desired port.

2.8.4 Process Scheduling

[PS] [Option Start] The functionality described in this section is dependent on support of the Process Scheduling option (and the rest of this section is not further marked for this option). [Option End]

Scheduling Policies

The scheduling semantics described in this volume of IEEE Std 1003.1-2001 are defined in terms of a conceptual model that contains a set of thread lists. No implementation structures are necessarily implied by the use of this conceptual model. It is assumed that no time elapses during operations described using this model, and therefore no simultaneous operations are possible. This model discusses only processor scheduling for runnable threads, but it should be noted that greatly enhanced predictability of realtime applications results if the sequencing of other resources takes processor scheduling policy into account.

There is, conceptually, one thread list for each priority. A runnable thread will be on the thread list for that thread's priority. Multiple scheduling policies shall be provided. Each non-empty thread list is ordered, contains a head as one end of its order, and a tail as the other. The purpose of a scheduling policy is to define the allowable operations on this set of lists (for example, moving threads between and within lists).

Each process shall be controlled by an associated scheduling policy and priority. These parameters may be specified by explicit application execution of the sched_setscheduler() or sched_setparam() functions.

Each thread shall be controlled by an associated scheduling policy and priority. These parameters may be specified by explicit application execution of the pthread_setschedparam() function.

Associated with each policy is a priority range. Each policy definition shall specify the minimum priority range for that policy. The priority ranges for each policy may but need not overlap the priority ranges of other policies.

A conforming implementation shall select the thread that is defined as being at the head of the highest priority non-empty thread list to become a running thread, regardless of its associated policy. This thread is then removed from its thread list.

Four scheduling policies are specifically required. Other implementation-defined scheduling policies may be defined. The following symbols are defined in the Base Definitions volume of IEEE Std 1003.1-2001, <sched.h>:

SCHED_FIFO

First in, first out (FIFO) scheduling policy.

SCHED_RR

Round robin scheduling policy.

SCHED_SPORADIC

[SS] [Option Start] Sporadic server scheduling policy. [Option End]

SCHED_OTHER

Another scheduling policy.

The values of these symbols shall be distinct.

SCHED_FIFO

Conforming implementations shall include a scheduling policy called the FIFO scheduling policy.

Threads scheduled under this policy are chosen from a thread list that is ordered by the time its threads have been on the list without being executed; generally, the head of the list is the thread that has been on the list the longest time, and the tail is the thread that has been on the list the shortest time.

Under the SCHED_FIFO policy, the modification of the definitional thread lists is as follows:

When a running thread becomes a preempted thread, it becomes the head of the thread list for its priority.
When a blocked thread becomes a runnable thread, it becomes the tail of the thread list for its priority.
When a running thread calls the sched_setscheduler() function, the process specified in the function call is modified to the specified policy and the priority specified by the _param_argument.
When a running thread calls the sched_setparam() function, the priority of the process specified in the function call is modified to the priority specified by the param argument.
When a running thread calls the pthread_setschedparam() function, the thread specified in the function call is modified to the specified policy and the priority specified by the _param_argument.
When a running thread calls the pthread_setschedprio() function, the thread specified in the function call is modified to the priority specified by the prio argument.
If a thread whose policy or priority has been modified other than by pthread_setschedprio() is a running thread or is runnable, it then becomes the tail of the thread list for its new priority.
If a thread whose policy or priority has been modified by pthread_setschedprio() is a running thread or is runnable, the effect on its position in the thread list depends on the direction of the modification, as follows:
1. If the priority is raised, the thread becomes the tail of the thread list.
2. If the priority is unchanged, the thread does not change position in the thread list.
3. If the priority is lowered, the thread becomes the head of the thread list.
When a running thread issues the sched_yield() function, the thread becomes the tail of the thread list for its priority.
At no other time is the position of a thread with this scheduling policy within the thread lists affected.

For this policy, valid priorities shall be within the range returned by the sched_get_priority_max() and sched_get_priority_min() functions when SCHED_FIFO is provided as the parameter. Conforming implementations shall provide a priority range of at least 32 priorities for this policy.

SCHED_RR

Conforming implementations shall include a scheduling policy called the ``round robin'' scheduling policy. This policy shall be identical to the SCHED_FIFO policy with the additional condition that when the implementation detects that a running thread has been executing as a running thread for a time period of the length returned by the sched_rr_get_interval() function or longer, the thread shall become the tail of its thread list and the head of that thread list shall be removed and made a running thread.

The effect of this policy is to ensure that if there are multiple SCHED_RR threads at the same priority, one of them does not monopolize the processor. An application should not rely only on the use of SCHED_RR to ensure application progress among multiple threads if the application includes threads using the SCHED_FIFO policy at the same or higher priority levels or SCHED_RR threads at a higher priority level.

A thread under this policy that is preempted and subsequently resumes execution as a running thread completes the unexpired portion of its round robin interval time period.

For this policy, valid priorities shall be within the range returned by the sched_get_priority_max() and sched_get_priority_min() functions when SCHED_RR is provided as the parameter. Conforming implementations shall provide a priority range of at least 32 priorities for this policy.

SCHED_SPORADIC

[SS|TSP] [Option Start] The functionality described in this section is dependent on support of the Process Sporadic Server or Thread Sporadic Server options (and the rest of this section is not further marked for these options). [Option End]

If _POSIX_SPORADIC_SERVER or _POSIX_THREAD_SPORADIC_SERVER is defined, the implementation shall include a scheduling policy identified by the value SCHED_SPORADIC.

The sporadic server policy is based primarily on two parameters: the replenishment period and the available execution capacity. The replenishment period is given by the sched_ss_repl_period member of the sched_param structure. The available execution capacity is initialized to the value given by the sched_ss_init_budget member of the same parameter. The sporadic server policy is identical to the SCHED_FIFO policy with some additional conditions that cause the thread's assigned priority to be switched between the values specified by the sched_priority and sched_ss_low_priority members of thesched_param structure.

The priority assigned to a thread using the sporadic server scheduling policy is determined in the following manner: if the available execution capacity is greater than zero and the number of pending replenishment operations is strictly less than_sched_ss_max_repl_, the thread is assigned the priority specified by sched_priority; otherwise, the assigned priority shall be sched_ss_low_priority. If the value of sched_priority is less than or equal to the value of_sched_ss_low_priority_, the results are undefined. When active, the thread shall belong to the thread list corresponding to its assigned priority level, according to the mentioned priority assignment. The modification of the available execution capacity and, consequently of the assigned priority, is done as follows:

When the thread at the head of the sched_priority list becomes a running thread, its execution time shall be limited to at most its available execution capacity, plus the resolution of the execution time clock used for this scheduling policy. This resolution shall be implementation-defined.
Each time the thread is inserted at the tail of the list associated with _sched_priority_- because as a blocked thread it became runnable with priority sched_priority or because a replenishment operation was performed-the time at which this operation is done is posted as the activation_time.
When the running thread with assigned priority equal to sched_priority becomes a preempted thread, it becomes the head of the thread list for its priority, and the execution time consumed is subtracted from the available execution capacity. If the available execution capacity would become negative by this operation, it shall be set to zero.
When the running thread with assigned priority equal to sched_priority becomes a blocked thread, the execution time consumed is subtracted from the available execution capacity, and a replenishment operation is scheduled, as described in 6 and 7. If the available execution capacity would become negative by this operation, it shall be set to zero.
When the running thread with assigned priority equal to sched_priority reaches the limit imposed on its execution time, it becomes the tail of the thread list for sched_ss_low_priority, the execution time consumed is subtracted from the available execution capacity (which becomes zero), and a replenishment operation is scheduled, as described in 6 and 7.
Each time a replenishment operation is scheduled, the amount of execution capacity to be replenished, replenish_amount, is set equal to the execution time consumed by the thread since the activation_time. The replenishment is scheduled to occur at activation_time plus sched_ss_repl_period. If the scheduled time obtained is before the current time, the replenishment operation is carried out immediately. Several replenishment operations may be pending at the same time, each of which will be serviced at its respective scheduled time. With the above rules, the number of replenishment operations simultaneously pending for a given thread that is scheduled under the sporadic server policy shall not be greater than_sched_ss_max_repl_.
A replenishment operation consists of adding the corresponding replenish_amount to the available execution capacity at the scheduled time. If, as a consequence of this operation, the execution capacity would become larger than_sched_ss_initial_budget_, it shall be rounded down to a value equal to sched_ss_initial_budget. Additionally, if the thread was runnable or running, and had assigned priority equal to sched_ss_low_priority, then it becomes the tail of the thread list for sched_priority.

Execution time is defined in The Name Space.

For this policy, changing the value of a CPU-time clock via clock_settime()shall have no effect on its behavior.

For this policy, valid priorities shall be within the range returned by the sched_get_priority_min() and sched_get_priority_max() functions when SCHED_SPORADIC is provided as the parameter. Conforming implementations shall provide a priority range of at least 32 distinct priorities for this policy.

SCHED_OTHER

Conforming implementations shall include one scheduling policy identified as SCHED_OTHER (which may execute identically with either the FIFO or round robin scheduling policy). The effect of scheduling threads with the SCHED_OTHER policy in a system in which other threads are executing under SCHED_FIFO, SCHED_RR, [SS] [Option Start] or SCHED_SPORADIC [Option End] is implementation-defined.

This policy is defined to allow strictly conforming applications to be able to indicate in a portable manner that they no longer need a realtime scheduling policy.

For threads executing under this policy, the implementation shall use only priorities within the range returned by the sched_get_priority_max() and sched_get_priority_min() functions when SCHED_OTHER is provided as the parameter.

2.8.5 Clocks and Timers

[TMR] [Option Start] The functionality described in this section is dependent on support of the Timers option (and the rest of this section is not further shaded for this option). [Option End]

The <time.h> header defines the types and manifest constants used by the timing facility.

Time Value Specification Structures

Many of the timing facility functions accept or return time value specifications. A time value structure timespecspecifies a single time value and includes at least the following members:

Member Type	Member Name	Description
time_t	tv_sec	Seconds.
long	tv_nsec	Nanoseconds.

The tv_nsec member is only valid if greater than or equal to zero, and less than the number of nanoseconds in a second (1000 million). The time interval described by this structure is (tv_sec * 109 + tv_nsec) nanoseconds.

A time value structure itimerspec specifies an initial timer value and a repetition interval for use by the per-process timer functions. This structure includes at least the following members:

Member Type	Member Name	Description
struct timespec	it_interval	Timer period.
struct timespec	it_value	Timer expiration.

If the value described by it_value is non-zero, it indicates the time to or time of the next timer expiration (for relative and absolute timer values, respectively). If the value described by it_value is zero, the timer shall be disarmed.

If the value described by it_interval is non-zero, it specifies an interval which shall be used in reloading the timer when it expires; that is, a periodic timer is specified. If the value described by it_interval is zero, the timer is disarmed after its next expiration; that is, a one-shot timer is specified.

Timer Event Notification Control Block

[RTS] [Option Start] Per-process timers may be created that notify the process of timer expirations by queuing a realtime extended signal. Thesigevent structure, defined in the Base Definitions volume of IEEE Std 1003.1-2001, <signal.h>, is used in creating such a timer. The sigevent structure contains the signal number and an application-specific data value which shall be used when notifying the calling process of timer expiration events. [Option End]

Manifest Constants

The following constants are defined in the Base Definitions volume of IEEE Std 1003.1-2001, <time.h>:

CLOCK_REALTIME

The identifier for the system-wide realtime clock.

TIMER_ABSTIME

Flag indicating time is absolute with respect to the clock associated with a timer.

CLOCK_MONOTONIC

[MON] [Option Start] The identifier for the system-wide monotonic clock, which is defined as a clock whose value cannot be set via clock_settime() and which cannot have backward clock jumps. The maximum possible clock jump is implementation-defined. [Option End]

The maximum allowable resolution for CLOCK_REALTIME and [MON] [Option Start] CLOCK_MONOTONIC [Option End] clocks and all time services based on these clocks is represented by {_POSIX_CLOCKRES_MIN} and shall be defined as 20 ms (1/50 of a second). Implementations may support smaller values of resolution for these clocks to provide finer granularity time bases. The actual resolution supported by an implementation for a specific clock is obtained using the clock_getres() function. If the actual resolution supported for a time service based on one of these clocks differs from the resolution supported for that clock, the implementation shall document this difference.

The minimum allowable maximum value for CLOCK_REALTIME and [MON] [Option Start] CLOCK_MONOTONIC [Option End] clocks and all absolute time services based on them is the same as that defined by the ISO C standard for the time_t type. If the maximum value supported by a time service based on one of these clocks differs from the maximum value supported by that clock, the implementation shall document this difference.

Execution Time Monitoring

[CPT] [Option Start] If _POSIX_CPUTIME is defined, process CPU-time clocks shall be supported in addition to the clocks described in Manifest Constants. [Option End]

[TCT] [Option Start] If _POSIX_THREAD_CPUTIME is defined, thread CPU-time clocks shall be supported. [Option End]

[CPT|TCT] [Option Start] CPU-time clocks measure execution or CPU time, which is defined in the Base Definitions volume of IEEE Std 1003.1-2001, Section 3.117, CPU Time (Execution Time). The mechanism used to measure execution time is described in the Base Definitions volume of IEEE Std 1003.1-2001, Section 4.9, Measurement of Execution Time. [Option End]

[CPT] [Option Start] If _POSIX_CPUTIME is defined, the following constant of the type clockid_t is defined in <time.h>:

CLOCK_PROCESS_CPUTIME_ID

When this value of the type clockid_t is used in a clock() or timer*() function call, it is interpreted as the identifier of the CPU-time clock associated with the process making the function call.

[Option End]

[TCT] [Option Start] If _POSIX_THREAD_CPUTIME is defined, the following constant of the type clockid_t is defined in <time.h>:

CLOCK_THREAD_CPUTIME_ID

When this value of the type clockid_t is used in a clock() or timer*() function call, it is interpreted as the identifier of the CPU-time clock associated with the thread making the function call. [Option End]

UNIX ® is a registered Trademark of The Open Group.
POSIX ® is a registered Trademark of The IEEE.
[ Main Index | XBD | XCU | XSH | XRAT]